pycottas

pycottas is a Python library for working with compressed RDF files in the COTTAS format. COTTAS stores triples as a triple table in the Apache Parquet format. pycottas is built on top of DuckDB and provides an HDT-like interface.

Features

  • Compression and decompression of RDF files.
  • Querying COTTAS files with triple patterns.
  • RDFLib store backend for querying COTTAS files with SPARQL.
  • Supports RDF datasets (quads).
  • Can be used as a library or from the command line.

COTTAS Files

COTTAS stands for COlumnar Triple TAble Storage and is based on the Apache Parquet file format. A COTTAS file consists of a single table with s, p, o, and g columns, which hold the subject, predicate, and object of each triple and, for quads, its named graph.
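To make the table layout concrete, the following is a minimal sketch of a four-column triple table and of how a triple pattern can be turned into a filter over it. It is not the pycottas API. It uses Python's standard-library sqlite3 module as a stand-in for DuckDB over Parquet so that it runs without extra dependencies. The table name `quads`, the helper names `build_table`, `pattern_to_sql`, and `search`, the sample IRIs, and the convention that terms starting with `?` are variables are all assumptions made for illustration.

```python
import sqlite3

# Column order of the triple table described above.
COLUMNS = ("s", "p", "o", "g")

# Hypothetical sample data: three statements spread over two named graphs.
ROWS = [
    ("<http://ex.org/alice>", "<http://ex.org/knows>", "<http://ex.org/bob>", "<http://ex.org/g1>"),
    ("<http://ex.org/alice>", "<http://ex.org/name>", '"Alice"', "<http://ex.org/g1>"),
    ("<http://ex.org/bob>", "<http://ex.org/knows>", "<http://ex.org/carol>", "<http://ex.org/g2>"),
]


def build_table(rows):
    """Create an in-memory table with s, p, o, g columns and load the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE quads (s TEXT, p TEXT, o TEXT, g TEXT)")
    conn.executemany("INSERT INTO quads VALUES (?, ?, ?, ?)", rows)
    return conn


def pattern_to_sql(pattern):
    """Translate a triple or quad pattern (a tuple of terms) into SQL.

    Terms starting with '?' are treated as variables and match anything;
    every other term becomes an equality filter on its column.
    """
    if len(pattern) not in (3, 4):
        raise ValueError("expected 3 (triple) or 4 (quad) terms")
    filters, params = [], []
    for column, term in zip(COLUMNS, pattern):
        if not term.startswith("?"):
            filters.append(f"{column} = ?")
            params.append(term)
    sql = "SELECT s, p, o, g FROM quads"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    return sql + " ORDER BY s, p, o", params


def search(conn, pattern):
    """Return all rows of the table that match the pattern."""
    sql, params = pattern_to_sql(pattern)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = build_table(ROWS)
    # All statements whose predicate is ex:knows, in any graph.
    for row in search(conn, ("?s", "<http://ex.org/knows>", "?o")):
        print(row)
```

Because every bound term maps to a filter on exactly one column, a columnar engine only needs to scan the columns that the pattern constrains, which is one reason a plain triple table can work well on top of a columnar file format.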

Licenses

pycottas is available under the Apache License 2.0.

The documentation is licensed under CC BY-SA 4.0.

Author

Citing

If you use pycottas in your work, please cite the ISWC paper:

@inproceedings{arenas2026cottas,
  title     = {{COTTAS: Columnar Triple Table Storage for Efficient and Compressed RDF Management}},
  author    = {Arenas-Guerrero, Julián and Ferrada, Sebastián},
  booktitle = {Proceedings of the 24th International Semantic Web Conference},
  year      = {2026},
  publisher = {Springer Nature Switzerland},
  isbn      = {978-3-032-09530-5},
  pages     = {313--331},
  doi       = {10.1007/978-3-032-09530-5_18},
}

OEG UPM