Class EpubContentParser

java.lang.Object
org.apache.tika.parser.epub.EpubContentParser
All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser

public class EpubContentParser extends Object implements org.apache.tika.parser.Parser
Parser for EPUB OPS *.html files.

For the time being, the content is assumed to be XHTML (TODO: support DTBook).

  • Constructor Details

    • EpubContentParser

      public EpubContentParser()
  • Method Details

    • getSupportedTypes

      public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
      Specified by:
      getSupportedTypes in interface org.apache.tika.parser.Parser
    • parse

      public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException, org.apache.tika.exception.TikaException
      Specified by:
      parse in interface org.apache.tika.parser.Parser
      Throws:
      IOException
      SAXException
      org.apache.tika.exception.TikaException
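
The parse method reports document content to the supplied ContentHandler as SAX events. The following is a minimal sketch of a handler that collects character data from XHTML content. It does not call EpubContentParser: it uses only the JDK's built-in SAX parser as a stand-in so that it runs without Tika on the classpath. The names XhtmlTextSketch, TextCollector and extractText are hypothetical and are not part of Tika.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XhtmlTextSketch {

    /**
     * Hypothetical helper (not part of Tika): accumulates the character
     * data delivered through SAX characters() callbacks.
     * DefaultHandler implements org.xml.sax.ContentHandler.
     */
    private static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        String text() {
            return text.toString();
        }
    }

    /**
     * Parses well-formed XHTML from the stream with the JDK SAX parser
     * (a stand-in for EpubContentParser in this sketch) and returns the
     * concatenated character data.
     */
    public static String extractText(InputStream stream)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        TextCollector handler = new TextCollector();
        factory.newSAXParser().parse(stream, handler);
        return handler.text();
    }

    public static void main(String[] args) throws Exception {
        // Illustrative input only; no DOCTYPE, so no external DTD is fetched.
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<body><p>Hello EPUB</p></body></html>";
        try (InputStream in =
                new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(extractText(in));
        }
    }
}
```

Because DefaultHandler implements ContentHandler, a handler of this kind could in principle be passed as the handler argument of parse, together with a Metadata and a ParseContext instance. The exact events EpubContentParser emits are not specified on this page, so the collected text may differ from what the stand-in parser produces.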