About Chainlabs
Chainlabs is a data startup focused on de-anonymizing the blockchain. We are the largest data labels broker in the web3 space. We make web3 safer by identifying blockchain addresses linked to cybercrime activities.
Our primary clients include web3 analytics companies such as Coinbase and Nansen, as well as law enforcement, traders, and academic researchers.
Position Overview
We are seeking a Senior Back End Engineer to develop key data pipelines, including web scrapers, ETL workflows, and AI agents. Your work will enable us to extract critical data from clearnet and darknet platforms and power our blockchain intelligence solutions.
In this role, you’ll design high-performance systems that move data seamlessly into PostgreSQL and BigQuery environments. By tackling complex challenges in web scraping and data engineering, you’ll help us collect, organize, and analyze the web3 ecosystem at scale.
Key Responsibilities
- Web Scraping:
- Design and deploy web scrapers to extract data from complex websites such as Etherscan, Arkham, and ChainAbuse.
- Implement data collection mechanisms with support for IP rotation, CAPTCHA handling, and dynamic content rendering.
- Build AI-based crawlers and scrapers to find hidden web3 wallets on the darknet (see the sketch below).
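To give a flavor of this work, here is a minimal sketch of a scraper with simple IP rotation, using requests and BeautifulSoup; the proxy addresses, target URL, and CSS selector are placeholders, not our actual infrastructure:

```python
# Minimal sketch: fetch a page through a rotating proxy pool and parse labels.
# Proxy addresses, the URL argument, and the CSS selector are placeholders.
import itertools

import requests
from bs4 import BeautifulSoup

PROXY_POOL = itertools.cycle([
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
])

def fetch_address_labels(url: str) -> list[str]:
    proxy = next(PROXY_POOL)  # rotate to the next proxy on every request
    resp = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # Real targets need site-specific selectors (and often JS rendering).
    return [tag.get_text(strip=True) for tag in soup.select("span.address-label")]
```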
- Data Engineering:
- Build and maintain ETL pipelines to ingest scraped data into PostgreSQL or BigQuery (see the ETL sketch below).
- Orchestrate AI-agent scrapers.
- Classify the collected web3 wallets into predefined categories (e.g., Swap, Bridge, Scam, Ransomware) according to our taxonomy definitions.
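A minimal ETL sketch of this flow; the connection string, `wallets` table, and category set are illustrative, not our production schema:

```python
# Minimal ETL sketch: upsert scraped wallets into PostgreSQL with a taxonomy check.
# The DSN, table schema, and category set are illustrative.
import psycopg2

TAXONOMY = {"swap", "bridge", "scam", "ransomware"}

def load_wallets(rows):
    """rows: iterable of (address, category) tuples from a scraper."""
    conn = psycopg2.connect("dbname=labels")  # hypothetical DSN
    with conn, conn.cursor() as cur:
        for address, category in rows:
            category = category.lower()
            if category not in TAXONOMY:
                category = "unclassified"  # fall back when a label is outside the taxonomy
            cur.execute(
                "INSERT INTO wallets (address, category) VALUES (%s, %s) "
                "ON CONFLICT (address) DO UPDATE SET category = EXCLUDED.category",
                (address, category),
            )
    conn.close()
```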
- API Maintenance:
- Maintain and enhance Chainlabs API endpoints to provide seamless access to the data.
- Use Python frameworks (e.g., Flask) to support and extend RESTful APIs (see the sketch below).
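A minimal Flask sketch of this kind of endpoint; the route, payload shape, and in-memory store are hypothetical, not our actual API:

```python
# Minimal Flask sketch: serve a wallet's category over REST.
# The route, response shape, and in-memory store are hypothetical.
from flask import Flask, jsonify

app = Flask(__name__)

WALLETS = {"0x0000000000000000000000000000000000000000": "scam"}  # placeholder data

@app.get("/v1/wallets/<address>")
def get_wallet(address: str):
    category = WALLETS.get(address.lower())
    if category is None:
        return jsonify(error="address not found"), 404
    return jsonify(address=address.lower(), category=category)
```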
- Data Analytics:
- Use SQL to analyze on-chain data, i.e., the behavior of collected wallets, their interactions with other entities, and other on-chain trends (see the sketch below).
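An illustrative analysis of this kind using the BigQuery client; the project, dataset, and table names are placeholders:

```python
# Illustrative BigQuery query: transaction volume per wallet category.
# The project, dataset, and table names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT w.category, COUNT(*) AS tx_count
    FROM `my-project.chain.transfers` AS t
    JOIN `my-project.chain.wallets` AS w ON w.address = t.from_address
    GROUP BY w.category
    ORDER BY tx_count DESC
"""
for row in client.query(query).result():
    print(row.category, row.tx_count)
```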
Requirements
- Proficiency in Python for data engineering, web scraping, and API development, plus experience with databases such as PostgreSQL.
- Experience with frameworks such as Scrapy, Selenium, and BeautifulSoup.
- Understanding of advanced web scraping techniques, including IP rotation, proxy management, and CAPTCHA handling.
- Familiarity with modern AI models, APIs, and SDKs (e.g., OpenAI, LLMs).
- Hands-on experience with ETL orchestration tools like Apache Airflow (see the sketch after this list).
- Strong problem-solving skills and a willingness to work with a wide variety of data.
- Ability to work independently and meet deadlines in a remote environment.
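For the orchestration point above, a minimal Airflow DAG sketch; the DAG id, schedule, and task bodies are illustrative:

```python
# Minimal Airflow DAG sketch: scrape, then load, once a day.
# The DAG id, schedule, and task bodies are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def scrape():
    ...  # run scrapers, stage raw results

def load():
    ...  # transform and push results into PostgreSQL/BigQuery

with DAG(
    dag_id="wallet_labels_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    scrape_task = PythonOperator(task_id="scrape", python_callable=scrape)
    load_task = PythonOperator(task_id="load", python_callable=load)
    scrape_task >> load_task
```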