Software


NoticIA
LLM fine-tuning and evaluation library for the NoticIA dataset of 850 Spanish news articles with human-written summaries.

Clickbait Fighter
AI that generates one-sentence summaries of clickbait news articles. Trained on 8×A100; deployed with vLLM and Ray.

GoLLIE
Guideline-following LLM for Information Extraction; supports zero-shot schemas defined on the fly.

T-Projection
High-quality annotation projection for sequence labeling datasets, built on Transformers + Accelerate.

Sequence Labeling with LLMs
Sequence labeling with LLMs via Text2Text constrained generation, built on Transformers + Accelerate.

Easy-Translate
Translate large text files with a single command. Easy for beginners, customizable for power users.

Easy Label Projection
Project labels across datasets using mGiza, FastAlign, SimAlign or AWESOME to generate resources for low-resource languages.

Context-enriched multilingual NER using knowledge bases
Candidate generation + KB linking + fine-grained classification using retrieved knowledge.

MetaVec
Monolingual and cross-lingual meta-embedding generation and evaluation framework.

Self Driving Car in Video Games
Supervised deep network that learns to drive in GTA V from human-labelled data.