Apache PyLucene is a Python extension that enables access to Java Lucene’s full-text indexing and search capabilities. It embeds a Java Virtual Machine into a Python process, allowing Python applications to leverage Lucene’s powerful search features through a machine-generated Python module.

Vendor

The Apache Software Foundation

Product details

Apache PyLucene

Apache PyLucene is a Python extension that provides access to the full-text indexing and search capabilities of Java Lucene. It is not a port: PyLucene embeds a Java Virtual Machine (JVM) into the Python process, so Python applications call the original Lucene classes directly. It is built using JCC, a C++ code generator that bridges Python and Java via the Java Native Interface (JNI).
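
As a rough illustration of the embedding described above, a Python process usually starts the JVM once before touching any Lucene class. The sketch below assumes PyLucene is installed; the helper name start_jvm and the headless JVM flag are illustrative choices, not part of PyLucene itself.

```python
def start_jvm():
    """Start the embedded JVM once and return the lucene module.

    Hypothetical helper for illustration; only initVM and getVMEnv
    come from PyLucene.
    """
    # Imported lazily so this file still loads where PyLucene is absent.
    import lucene

    # getVMEnv() returns None until the JVM has been started in this process.
    if lucene.getVMEnv() is None:
        # vmargs are ordinary JVM options; the headless flag is an assumption.
        lucene.initVM(vmargs=["-Djava.awt.headless=true"])
    return lucene
```

Calling start_jvm() a second time is harmless, because the guard skips initVM when a VM is already running.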

Features

  • Full API compatibility with Java Lucene (currently tracking Lucene 10.0.0)
  • Python module lucene generated by JCC for seamless integration
  • Support for Lucene contrib packages including Snowball stemmers, highlighters, and specialized analyzers
  • Pythonic extensions for iteration, property access, and mapping protocols
  • Exception handling via JavaError for Java-Python boundary errors
  • Java array support through JArray wrappers
  • Extension points for customizing analyzers and tokenizers from Python
  • Threading support via attachCurrentThread, which threads created in Python call before using Lucene objects
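
Three of the features above (JavaError, JArray, and attachCurrentThread) can be sketched as follows. This is a minimal sketch that assumes PyLucene is installed and that the JVM has already been started with lucene.initVM() in the main thread. The function names and the "body" field name are hypothetical.

```python
import threading


def parse_or_none(query_text, field="body"):
    """Parse a query string; return None when Lucene rejects it."""
    import lucene
    from org.apache.lucene.analysis.standard import StandardAnalyzer
    from org.apache.lucene.queryparser.classic import QueryParser

    try:
        return QueryParser(field, StandardAnalyzer()).parse(query_text)
    except lucene.JavaError as err:
        # JavaError wraps the Java exception that crossed into Python.
        print("Lucene rejected the query:", err.getJavaException().getMessage())
        return None


def make_int_array(values):
    """Copy a Python sequence of ints into a Java int[] wrapper."""
    from lucene import JArray

    return JArray("int")(list(values))


def run_attached(task):
    """Run task() in a new Python thread that is attached to the JVM."""
    import lucene

    def wrapper():
        # A thread created in Python must attach itself before any Java call.
        lucene.getVMEnv().attachCurrentThread()
        task()

    thread = threading.Thread(target=wrapper)
    thread.start()
    return thread
```

Keeping the attach call inside the wrapper means the task itself needs no knowledge of the JVM.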

Capabilities

  • Enables Python applications to perform advanced text indexing and search
  • Supports creation and manipulation of Lucene indexes from Python
  • Allows extension of Lucene classes directly in Python
  • Facilitates integration with existing Python-based data processing pipelines
  • Provides access to Lucene’s scoring, query parsing, and document handling features
  • Compatible with macOS, Linux, Solaris, and Windows
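
The indexing and search capabilities listed above might look like this in practice. The sketch assumes PyLucene built against a recent Lucene release (one that provides ByteBuffersDirectory and IndexSearcher.storedFields). The function name index_and_search and the field name "body" are hypothetical.

```python
def index_and_search(texts, query_text, field="body", limit=10):
    """Index strings in memory and return (score, text) pairs for a query."""
    import lucene

    # Start the embedded JVM if this process has not done so yet.
    if lucene.getVMEnv() is None:
        lucene.initVM(vmargs=["-Djava.awt.headless=true"])

    from org.apache.lucene.analysis.standard import StandardAnalyzer
    from org.apache.lucene.document import Document, Field, TextField
    from org.apache.lucene.index import (
        DirectoryReader, IndexWriter, IndexWriterConfig)
    from org.apache.lucene.queryparser.classic import QueryParser
    from org.apache.lucene.search import IndexSearcher
    from org.apache.lucene.store import ByteBuffersDirectory

    analyzer = StandardAnalyzer()
    directory = ByteBuffersDirectory()  # in-memory index, nothing on disk

    # Write one Lucene document per input string.
    writer = IndexWriter(directory, IndexWriterConfig(analyzer))
    for text in texts:
        doc = Document()
        doc.add(TextField(field, text, Field.Store.YES))
        writer.addDocument(doc)
    writer.close()

    # Search the index and read the stored text back for each hit.
    reader = DirectoryReader.open(directory)
    try:
        searcher = IndexSearcher(reader)
        query = QueryParser(field, analyzer).parse(query_text)
        stored = searcher.storedFields()
        hits = searcher.search(query, limit).scoreDocs
        return [(hit.score, stored.document(hit.doc).get(field))
                for hit in hits]
    finally:
        reader.close()
```

Under these assumptions, index_and_search(["the quick brown fox", "lazy dogs sleep"], "fox") would be expected to return a single pair containing the first string. An on-disk index would use a different Directory implementation in place of ByteBuffersDirectory.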

Benefits

  • Combines the performance and robustness of Lucene with Python’s simplicity
  • Reduces development time for search-enabled Python applications
  • Enhances flexibility through Pythonic API enhancements
  • Promotes reuse of Lucene’s mature search infrastructure in Python environments
  • Supports customization and extension without modifying Java code
  • Backed by the Apache Software Foundation for long-term stability