Class DocumentSearchControllerImpl

  • All Implemented Interfaces:
    DocumentSearchController

    public class DocumentSearchControllerImpl
    extends Object
    implements DocumentSearchController
    Document search controller used to manage document searches. This class takes care of many of the performance issues of searching large documents and is also used by PageViewComponentImpl to highlight search results.
    This implementation uses a simple search algorithm that will work well for most users. This class can be extended and the method searchHighlightPage(int) can be overridden for custom search implementations.
    The DocumentSearchControllerImpl can be constructed for use with the Viewer RI source code via the constructor that takes a SwingController as a parameter. The second variation is intended for a headless environment where Swing is not needed; the constructor for this instance takes a Document as a parameter.
    Since:
    4.0
    • Constructor Detail

      • DocumentSearchControllerImpl

        public DocumentSearchControllerImpl​(SwingController viewerController)
        Creates a new instance of the search controller. A search model is created for this instance.
        Parameters:
        viewerController - parent controller/mediator.
      • DocumentSearchControllerImpl

        public DocumentSearchControllerImpl​(Document document)
        Creates a new instance of the search controller intended to be used in a headless environment. A search model is created for this instance.
        Parameters:
        document - document to search.
    • Method Detail

      • searchHighlightPage

        public int searchHighlightPage​(int pageIndex,
                                       String term,
                                       boolean caseSensitive,
                                       boolean wholeWord)
        Searches the given page using the specified term and properties. The search model is updated to store the page's PageText as a weak reference, which can be queried using isSearchHighlightRefreshNeeded(int, PageText) to efficiently make sure that a page's text is highlighted even after a dispose/init cycle. If the text state is no longer present then the search should be executed again.
        This method clears the search results for the page before it searches. If you wish to have cumulative search results then search terms should be added with addSearchTerm(String, boolean, boolean) and the method searchPage(int) should be called after each term is added or after all have been added.
        Specified by:
        searchHighlightPage in interface DocumentSearchController
        Parameters:
        pageIndex - page to search
        caseSensitive - if true use case sensitive searches
        wholeWord - if true use whole word searches
        term - term to search for
        Returns:
        number of hits for this page.
      • searchHighlightPage

        public int searchHighlightPage​(int pageIndex)
        Searches the page at the given index using the search terms that have been added with addSearchTerm(String, boolean, boolean). If search hits were detected then the Page's PageText is added to the cache.
        This method represents the org.icepdf.core search algorithm for this DocumentSearchController implementation. This method can be overridden if a different search algorithm or functionality is needed.
        Specified by:
        searchHighlightPage in interface DocumentSearchController
        Parameters:
        pageIndex - page index to search
        Returns:
        number of hits found for this page.
      • searchHighlightPage

        public List<LineText> searchHighlightPage​(int pageIndex,
                                                  int wordPadding)
        Searches the page at the given index using the search terms that have been added with addSearchTerm(String, boolean, boolean). If search hits were detected then the Page's PageText is added to the cache.
        This method differs from searchHighlightPage(int) in that it returns a list of LineText fragments, one for each hit, with each LineText padded by the words that precede and follow the hit in the page context.
        This method represents the org.icepdf.core search algorithm for this DocumentSearchController implementation. This method can be overridden if a different search algorithm or functionality is needed.
        Specified by:
        searchHighlightPage in interface DocumentSearchController
        Parameters:
        pageIndex - page index to search
        wordPadding - word padding on either side of hit to give context to found words in the returned LineText
        Returns:
        list of contextual hits for the given page. If there are no hits, an empty list is returned.
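The padding behavior described for wordPadding can be pictured with a minimal sketch. It is not the ICEpdf implementation: the class WordPaddingSketch and the method contextFor are hypothetical, plain strings stand in for WordText, and the sketch assumes that padding is counted in whole words and clamped at the ends of the available text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, plain strings instead of WordText/LineText.
final class WordPaddingSketch {

    // Returns the hit word plus up to `padding` words on each side,
    // clamped so the window never runs past either end of the word list.
    static List<String> contextFor(List<String> words, int hitIndex, int padding) {
        int from = Math.max(0, hitIndex - padding);
        int to = Math.min(words.size(), hitIndex + padding + 1); // exclusive end
        return new ArrayList<>(words.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("the", "quick", "brown", "fox", "jumps");
        // Hit on "brown" with one word of padding on each side.
        System.out.println(contextFor(words, 2, 1)); // prints [quick, brown, fox]
    }
}
```

A hit near the start or end of the text simply yields a shorter fragment, because the window is clamped rather than wrapped.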
      • searchPage

        public ArrayList<WordText> searchPage​(int pageIndex)
        Search page but only return words that are hits. Highlighting is still applied but this method can be used if other data needs to be extracted from the found words.
        Specified by:
        searchPage in interface DocumentSearchController
        Parameters:
        pageIndex - page to search
        Returns:
        list of words that match the term and search properties.
      • showWord

        public void showWord​(int pageIndex,
                             WordText word)
        Navigate to the page that the current word is on.
        Specified by:
        showWord in interface DocumentSearchController
        Parameters:
        pageIndex - page number to navigate to
        word - word that has been marked as a cursor.
      • addSearchTerm

        public SearchTerm addSearchTerm​(String term,
                                        boolean caseSensitive,
                                        boolean wholeWord)
        Add the search term to the list of search terms. The term is split into words based on white space and punctuation if the search mode is WORD. No checks are done for duplication.
        A new search needs to be executed for this change to take effect.
        Specified by:
        addSearchTerm in interface DocumentSearchController
        Parameters:
        term - single word or phrase to search for.
        caseSensitive - is search case sensitive.
        wholeWord - is search whole word sensitive.
        Returns:
        the newly created search term.
      • removeSearchTerm

        public void removeSearchTerm​(SearchTerm searchTerm)
        Removes the specified search term from the search. A new search needs to be executed for this change to take effect.
        Specified by:
        removeSearchTerm in interface DocumentSearchController
        Parameters:
        searchTerm - search term to remove.
      • clearSearchHighlight

        public void clearSearchHighlight​(int pageIndex)
        Clear all searched items for specified page.
        Specified by:
        clearSearchHighlight in interface DocumentSearchController
        Parameters:
        pageIndex - page index to clear
      • clearAllSearchHighlight

        public void clearAllSearchHighlight()
        Clears all highlighted text states for this document. This is optimized to use the SearchHighlightModel to clear only the pages that still have selected states.
        Specified by:
        clearAllSearchHighlight in interface DocumentSearchController
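The optimization mentioned above, clearing only the pages that still carry highlight state, can be sketched as follows. This is an assumption-laden illustration and not the ICEpdf SearchHighlightModel: the class TrackedClearSketch and all of its members are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a highlight model; not the ICEpdf SearchHighlightModel.
final class TrackedClearSketch {

    private final boolean[] highlighted;                      // one flag per page
    private final Set<Integer> pagesWithHits = new TreeSet<>(); // pages that need clearing

    TrackedClearSketch(int pageCount) {
        highlighted = new boolean[pageCount];
    }

    // Records that a page now carries highlight state.
    void markHighlighted(int pageIndex) {
        highlighted[pageIndex] = true;
        pagesWithHits.add(pageIndex);
    }

    boolean isHighlighted(int pageIndex) {
        return highlighted[pageIndex];
    }

    // Clears only the tracked pages and returns how many pages were visited.
    int clearAll() {
        int visited = 0;
        for (int pageIndex : pagesWithHits) {
            highlighted[pageIndex] = false;
            visited++;
        }
        pagesWithHits.clear();
        return visited;
    }

    public static void main(String[] args) {
        TrackedClearSketch model = new TrackedClearSketch(1000);
        model.markHighlighted(3);
        model.markHighlighted(42);
        System.out.println(model.clearAll()); // prints 2
    }
}
```

With a 1000-page document and hits on two pages, the clear touches two pages instead of walking all 1000.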
      • isSearchHighlightRefreshNeeded

        public boolean isSearchHighlightRefreshNeeded​(int pageIndex,
                                                      PageText pageText)
        Tests whether a search highlight refresh is needed. This is done by first checking whether there is a hit for this page and then whether the PageText object is the same as the one specified as a parameter. If they are not the same PageText object then a refresh is needed, as the page was disposed and reinitialized with new content.
        Specified by:
        isSearchHighlightRefreshNeeded in interface DocumentSearchController
        Parameters:
        pageIndex - page index to test for results.
        pageText - current pageText object associated with the pageIndex.
        Returns:
        true if refresh is needed, false otherwise.
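The check described above, a recorded hit plus an identity comparison against a weakly referenced text object, can be sketched as follows. This is a minimal illustration under stated assumptions, not the ICEpdf source: the class RefreshCheckSketch and its methods are hypothetical, and a plain Object stands in for PageText.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a search model; Object plays the role of PageText.
final class RefreshCheckSketch {

    // page index -> weak reference to the text object the hits were computed against
    private final Map<Integer, WeakReference<Object>> searchedPages = new HashMap<>();

    // Records that a page produced hits against this particular text object.
    void recordHits(int pageIndex, Object pageText) {
        searchedPages.put(pageIndex, new WeakReference<>(pageText));
    }

    // A refresh is needed when the page has recorded hits but the text object
    // passed in is not the same instance the hits were computed against.
    boolean isRefreshNeeded(int pageIndex, Object pageText) {
        WeakReference<Object> ref = searchedPages.get(pageIndex);
        if (ref == null) {
            return false; // no hits recorded, nothing to re-highlight
        }
        // Identity comparison, not equals(); a collected referent yields null.
        return ref.get() != pageText;
    }

    public static void main(String[] args) {
        RefreshCheckSketch model = new RefreshCheckSketch();
        Object original = new Object();
        model.recordHits(0, original);
        System.out.println(model.isRefreshNeeded(0, original));     // prints false
        System.out.println(model.isRefreshNeeded(0, new Object())); // prints true
    }
}
```

Holding the text weakly means the model does not keep a disposed page's text alive; once the referent is collected, any new text object for that page compares as different and triggers a refresh.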
      • getPageText

        protected PageText getPageText​(int pageIndex)
        Gets the page text for the given page index.
        Parameters:
        pageIndex - page index of page to extract text.
        Returns:
        page's page text, can be null.
      • searchPhraseParser

        protected ArrayList<String> searchPhraseParser​(String phrase)
        Utility for breaking the pattern up into searchable words. Breaks are done on white space and punctuation.
        Parameters:
        phrase - pattern to break into words.
        Returns:
        list of tokens that make up the phrase: words, spaces and punctuation.
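The splitting described above can be approximated with a short sketch. It is not the ICEpdf implementation of searchPhraseParser: the class PhraseSplitSketch and the method split are hypothetical, and the sketch assumes that any character that is not a letter or digit ends the current word and becomes a token of its own.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the ICEpdf implementation of searchPhraseParser.
final class PhraseSplitSketch {

    // Splits a phrase into word, white space and punctuation tokens, in order.
    static List<String> split(String phrase) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < phrase.length(); i++) {
            char c = phrase.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c); // still inside a word
            } else {
                if (word.length() > 0) {
                    tokens.add(word.toString()); // flush the word that just ended
                    word.setLength(0);
                }
                tokens.add(String.valueOf(c)); // space or punctuation is its own token
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString()); // flush a trailing word
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Four tokens: "Hello", ",", " " and "world".
        System.out.println(split("Hello, world").size()); // prints 4
    }
}
```

Keeping the separators as tokens lets a phrase search compare the page text token by token, including the space and punctuation between words.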
      • getComponentsFor

        public Set<SearchHitComponent> getComponentsFor​(int pageIndex)
        Returns the components created by the search for a given page.
        Parameters:
        pageIndex - The index of the page
        Returns:
        The set of components
      • addComponent

        protected void addComponent​(int pageIndex,
                                    String text,
                                    Rectangle2D.Double bounds)
        Adds a component with the given arguments.
        Parameters:
        pageIndex - The page to add the component to
        text - The text of the component
        bounds - The bounds of the component