Joomla/Mambo – PDF Indexer Module

PDF Indexer

Allow PDFs to be searched via the Joomla/Mambo search module.

This Joomla/Mambo component indexes the PDFs located within your Joomla directory, and the corresponding mosbot lets the Joomla search component search that index. As a result, the text inside PDFs is matched when someone searches a Joomla site.

Version 2.4

New Features:
* Joomla 1.5 legacy support
* More bug fixes

Also Featuring:
* Indexes only new PDFs, so indexing is much faster.
* Detects PDF file changes: if a PDF has changed, it is indexed again on the next pass.
* Deletes index entries for PDFs that have been removed from your file structure.
* Password-protected PDF indexing!
* Ability to edit existing indexes (for image-based PDFs, add keywords and phrases)
* Improved MosBot
* Other Various Bug Fixes

Does not work on servers running PHP in safe mode or with popen() disabled.
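If you are not sure whether your host falls under that restriction, a quick check along these lines may help. It is only a sketch: the function name pdfindexer_env_problems is made up for this example and is not part of PDF Indexer, and it only looks at the safe_mode and disable_functions settings.

```php
<?php
// Hypothetical helper (not part of PDF Indexer): lists reasons why this
// PHP setup probably cannot launch external programs through popen().
function pdfindexer_env_problems()
{
    $problems = array();

    // safe_mode was removed in PHP 5.4; ini_get() returns false for unknown directives.
    $safeMode = ini_get('safe_mode');
    if ($safeMode !== false
        && in_array(strtolower((string) $safeMode), array('1', 'on', 'yes', 'true'), true)) {
        $problems[] = 'safe_mode is enabled';
    }

    // popen() can be switched off through disable_functions in php.ini.
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    if (!function_exists('popen') || in_array('popen', $disabled, true)) {
        $problems[] = 'popen() is not available';
    }

    return $problems;
}

$problems = pdfindexer_env_problems();
echo $problems ? implode("\n", $problems) . "\n" : "No obvious problems found\n";
```

An empty list does not guarantee that indexing will work; it only rules out the two causes named above.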

Great work! Really!
But I ran into a big problem…

I have tested PDF Indexer with Joomla 1.5 and it works PERFECTLY with “small” PDF files.
By “small” PDF files I mean up to 1 MB.
When I tried a “bigger” PDF file, like 10 MB or, even worse, 40 MB or 80 MB, it seemed to work (that is, no errors were reported), but when I looked for it in “Modify Indexes” in the Administration Menu… it wasn’t there.


1. Edit the file…

Lines 366,445:
Change this…
$contents .= fread($handle2, 8192);
to this…
$contents .= fread($handle2, $fileSize);

then add the following line…
in the first lines of the file.
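For anyone wondering what the fread() change amounts to, here is a rough sketch of the two reading styles. It assumes the handle points at a regular file and that $fileSize comes from filesize(); I can't see the surrounding component code here, so the helper names read_all_chunked and read_all_at_once are invented for the example.

```php
<?php
// Hypothetical helper: read a regular file completely in 8192-byte chunks.
function read_all_chunked($path)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    $contents = '';
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break;
        }
        $contents .= $chunk;
    }
    fclose($handle);
    return $contents;
}

// Hypothetical helper: read the same file with one fread() sized by filesize().
function read_all_at_once($path)
{
    $fileSize = filesize($path);
    if ($fileSize === false) {
        return false;
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    // fread() needs a length greater than zero.
    $contents = $fileSize > 0 ? fread($handle, $fileSize) : '';
    fclose($handle);
    return $contents;
}

// Demo on a throwaway 20000-byte file: both styles should return the same bytes.
$tmp = tempnam(sys_get_temp_dir(), 'pdfidx');
file_put_contents($tmp, str_repeat('x', 20000));
var_dump(read_all_chunked($tmp) === read_all_at_once($tmp));
unlink($tmp);
```

Either way the whole file ends up in one PHP string, which is why the memory limit matters for large PDFs.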

Alternative: if you have access, change memory_limit = 32M to memory_limit = 128M in your /etc/php.ini file, then restart Apache!
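Before touching php.ini it can be worth checking what limit PHP is actually running with. The snippet below is a small sketch for that; ini_size_to_bytes is a made-up helper name, and it only understands the K, M and G suffixes used in php.ini shorthand.

```php
<?php
// Hypothetical helper: convert a php.ini shorthand size such as "32M" to bytes.
// A value of -1 means "no limit" and is returned unchanged.
function ini_size_to_bytes($value)
{
    $value = trim((string) $value);
    if ($value === '') {
        return 0;
    }
    $number = (int) $value; // (int) "128M" gives 128
    if ($number < 0) {
        return -1;
    }
    switch (strtoupper(substr($value, -1))) {
        case 'G':
            $number *= 1024; // fall through
        case 'M':
            $number *= 1024; // fall through
        case 'K':
            $number *= 1024;
    }
    return $number;
}

// Report the limit of the PHP process that runs this script.
$limit = ini_size_to_bytes(ini_get('memory_limit'));
echo $limit < 0 ? "memory_limit: unlimited\n" : "memory_limit: $limit bytes\n";
```

Run it through the web server rather than the command line, because the two often use different php.ini files.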

2. Edit the file…
set-variable = max_allowed_packet=xM
where x is the size needed in MB (for example, 5M for 5 MB)!
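As an illustration only, and assuming the file meant here is the MySQL server option file (commonly my.cnf or my.ini, the post does not say), the setting would sit in the [mysqld] section roughly like this:

```ini
[mysqld]
# Older MySQL releases accept the set-variable form used in this post
set-variable = max_allowed_packet=5M

# Newer releases dropped set-variable; there the plain form is used instead:
# max_allowed_packet=5M
```

Use one form or the other, not both, and pick a value at least as large as the biggest block of index text you expect to store.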

Fire up phpMyAdmin or open your favorite MySQL manager.
Go to the Joomla database and, in the table that stores the index data, change the type of the Description field to LONGTEXT.
Restart MySQL!
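If you would rather run a query than click through phpMyAdmin, the change boils down to a single ALTER TABLE statement. The sketch below just builds that statement; build_longtext_alter is a made-up helper, and the table name jos_com_file_index is only the guess given in the comments for a default installation, so check your own database first.

```php
<?php
// Hypothetical helper: build the ALTER TABLE statement that changes a
// column's type to LONGTEXT. Returns false for anything but plain identifiers.
function build_longtext_alter($table, $column)
{
    foreach (array($table, $column) as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            return false;
        }
    }
    return "ALTER TABLE `$table` MODIFY `$column` LONGTEXT";
}

// Table and column names below are assumptions, not confirmed by the component.
echo build_longtext_alter('jos_com_file_index', 'Description') . ";\n";
// prints: ALTER TABLE `jos_com_file_index` MODIFY `Description` LONGTEXT;
```

Note that MODIFY replaces the whole column definition, so if the original column was declared NOT NULL or had a default, add that to the statement again. Back up the table before running it.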

That’s all!

The results were great. A 78 MB PDF file was indexed in a few seconds!

  • lsb

    We have a big problem: none of our PDFs are indexed. Only the first characters are taken into account…

  • Rob

    Thanks for this explanation. You write this: “Go to the Joomla DATABASE and in the TABLE that stores the data for indexing edit the FIELD Description to LONGTEXT.”

    Can you help me, how to find that TABLE, what is the name of it?

    Help is appreciated!

    • Rob, the DATABASE you are looking for is the one you created when you installed Joomla! If you followed the default installation, the TABLE name will probably be something like “jos_com_file_index”.

      Hope that helps!

  • Carlos

    Hi All:
    I’ve got a problem with PDF Indexer. My Joomla website is full of PDF files, and I created a search module and installed PDF Indexer, but it only works well with PDFs whose text has no special characters. For example, if the PDF files contain text with á, Á or ñ, it shows nothing. How can I resolve this?

  • Hernani

    I’ve done this and it still doesn’t work.
    I’m getting this message:


    * Array: File has not been uploaded


    * Warning! – Failed to move file.
    * Error. Unable to upload file (from to C:\(…)

    Any clue?

  • Damjan

    I have a website with two parts, one for users (requires login) and the other for guests. Is there any way to enable PDF search for non-logged-in users (guests)? At the moment it does not work until you log in.
    I have gone through all the configuration files and still couldn’t find it…
    Thanks in advance