I see this confusion so often it seems worth addressing.
If you scan a page of text, what you have is a picture. A computer sees it not as letters, numbers, and punctuation, but as pixels, bits of light and shade and color, just like the pixels in your favorite family photo on Flickr.
You can’t search for, extract, highlight, or cut-and-paste such “text.” It doesn’t matter whether you embed the picture in a PDF; you still can’t search it. Ceci n’est pas un texte!
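Here is a minimal sketch of the difference, in case seeing it helps. The two-letter 3x5 "font" and the helper names (`GLYPHS`, `render`, `to_bytes`) are made up for illustration; a real scanner produces far larger images, but the principle is the same: the picture contains brightness values, not character codes.

```python
# Hypothetical 3x5 bitmap "font" with just two letters, for illustration only.
GLYPHS = {
    "H": ["# #", "# #", "###", "# #", "# #"],
    "I": ["###", " # ", " # ", " # ", "###"],
}


def render(text):
    """Rasterize text into rows of 0/1 pixels, one blank column after each letter."""
    rows = []
    for y in range(5):
        row = []
        for ch in text:
            row.extend(1 if c == "#" else 0 for c in GLYPHS[ch][y])
            row.append(0)  # spacing column
        rows.append(row)
    return rows


def to_bytes(pixels):
    """Flatten the pixel grid into raw bytes (0 = blank, 255 = ink)."""
    return bytes(255 * p for row in pixels for p in row)


text_document = "HI"                   # what a word processor stores
scanned_page = to_bytes(render("HI"))  # what a "scan" of the same word stores

print("HI" in text_document)   # True: the characters are there to be found
print(b"HI" in scanned_page)   # False: only pixel values, no character codes
```

A person looking at the rendered pixels reads "HI" instantly. The search function, which only compares bytes, finds nothing to match.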
Compare this to creating a PDF from a word-processing or page-layout document. The computer already thinks of the text in these documents as text, so it can embed the text in the PDF as text. The text is thus searchable, extractable, and all that good stuff. (Within limits. PDF is horrible for text-mining, for reasons I may decide to discuss sometime.)
To make the text in a scanned picture searchable, you must use Optical Character Recognition (OCR) technology on the picture. OCR tools look at the picture and try to figure out what letters, numbers, and punctuation it contains. Once you’ve OCRed the picture, you may embed the text in the PDF along with the picture, whereupon you may be able to search and extract it.
But no OCR, no text, as far as computers are concerned.
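To give a feel for what "try to figure out" means, here is a toy sketch of one simple recognition approach, template matching: cut the picture into letter-sized cells and pick the known letter shape that differs from each cell in the fewest pixels. Everything here (the two-letter 3x5 font, `render`, `recognize`) is invented for illustration. Real OCR tools are far more sophisticated and I am not claiming any of them work exactly this way.

```python
# Hypothetical 3x5 letter shapes; a real OCR engine knows far more than two.
GLYPHS = {
    "H": ("# #", "# #", "###", "# #", "# #"),
    "I": ("###", " # ", " # ", " # ", "###"),
}
GLYPH_WIDTH = 3
CELL_WIDTH = GLYPH_WIDTH + 1  # each letter is followed by one blank column


def render(text):
    """Draw text as a picture: five strings of '#' (ink) and ' ' (blank)."""
    return ["".join(GLYPHS[ch][y] + " " for ch in text) for y in range(5)]


def mismatches(cell, letter):
    """Count the pixels where a cell differs from a known letter shape."""
    return sum(
        got != want
        for got_row, want_row in zip(cell, GLYPHS[letter])
        for got, want in zip(got_row, want_row)
    )


def recognize(picture):
    """Guess the text in a picture by choosing the closest letter for each cell."""
    letters = []
    for x in range(0, len(picture[0]), CELL_WIDTH):
        cell = [row[x:x + GLYPH_WIDTH] for row in picture]
        letters.append(min(GLYPHS, key=lambda letter: mismatches(cell, letter)))
    return "".join(letters)


picture = render("HI")

# Simulate a speck of dirt on the scanner glass: one extra inked pixel.
noisy = picture.copy()
noisy[0] = noisy[0][:1] + "#" + noisy[0][2:]

print(recognize(picture))  # HI
print(recognize(noisy))    # HI (the closest match still wins)
```

Note that `recognize` always returns its best guess, even when the picture is smudged. That is why OCR output can contain mistakes, and why the result is only as good as the scan and the tool.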
Was that clear?