ingestr is a CLI tool that seamlessly copies data between any databases with a single command.
Updated Mar 15, 2026 - Python
Python-based open-source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked-data graph database
From data to vector database effortlessly
AzLogDcrIngestPS - Unleashing the power of Log Ingestion API with Azure LogAnalytics custom table v2, Azure Data Collection Rules and Azure Data Ingestion Pipeline
Google Cloud Storage connector, pre-processor and model for predicting user search intent based on keywords
Google Analytics connector, pre-processor and model for predicting churning users for digital publishers.
Extract, transform, and load unstructured data into Clarifai's AI platform
This is an Elasticsearch Ingest Pipeline Processor that calls an HTTP(s) endpoint and adds the response back to the ingest document for further processing.
Real-time flight data fetching, cleaning, and analytics API using FastAPI, Pandas, PostgreSQL, and Python.
My experiments with Apache Spark for Humans ⭐
A question-answering (Q/A) chatbot for insurance documents, powered by Retrieval-Augmented Generation (RAG), LlamaIndex, and LangGraph. Inspired by my Upgrad_IIITB PG course.
Multi-disease segmentation of chest X-rays using YOLO, DenseNet121, and CoAtNet models
Sample Azure Data Factory pipeline for ingesting Data Packages directly from the Download API of the Ordnance Survey Data Hub into Azure Storage.
DataStax or Cassandra Ingest from Relational Databases with StreamSets
DAUT – Documentation Auto Updater - AI-powered documentation generator for your codebase. MCP-Connector
Enterprise-grade ingestion blueprint for Postgres to Databricks powered by dlt. Features dual-mode operation (Full Load + CDC Load) and robust CI/CD via Databricks Asset Bundles.
Data stack for WeLearn LPI projects. This pipeline can collect, vectorize and store data from various sources.
A question-answering (Q/A) chatbot for insurance documents, powered by Retrieval-Augmented Generation (RAG), LlamaIndex, and LangChain. Inspired by my Upgrad_IIITB PG course.
A data pipeline that uses Sqoop to ingest data from SQL Server into a Hive table, with Hive used for feature engineering and analysis.
AI-powered document search agent using Google ADK and Gemini — scans, reasons, and follows cross-references instead of blind retrieval.