Data Engineer - Real-Time Systems
Division: DATUM, Impac Exploration Services
Location: Remote, Oklahoma City (OK), Houston (TX), San Jose (CA)
Type: Full-Time
We're done with the "load it tonight, analyze it tomorrow" paradigm. At DATUM, decisions happen in milliseconds, not morning reports. We need a data engineer who believes streaming is the default, not the exception—someone who gets genuinely frustrated when people suggest "just run it as a nightly job."
Our data doesn't wait for convenient processing windows. It flows from sensors, cameras, and systems that never sleep. Your infrastructure will catch it, process it, and serve insights before traditional pipelines even know it arrived. If you think Kafka is table stakes and real-time inference is the only kind worth doing, we should talk.
What You'll Build
•Streaming pipelines that handle millions of events per second
•Infrastructure for real-time ML inference at the edge and core
•Systems that treat historical data as streaming replay, not static files
•Data architectures that scale
•Fault-tolerant pipelines that keep flowing when hardware fails
Your Philosophy
•The best data lake is a flowing river
•Every millisecond of latency is a missed opportunity
•Static ETL is where good data goes to get stale
•If it's not real-time, it's not real
Technical Reality
Core streaming stack:
•Apache Kafka/Pulsar/Redpanda (or better alternatives you'll introduce)
•Flink/Spark Streaming for complex event processing
•Time-series databases that can actually keep up (TimescaleDB, InfluxDB, or custom)
•Languages: Python/Java/Rust—whatever makes it fast
•Container orchestration without cloud vendor lock-in
What you won't use:
•Traditional ETL tools that think "streaming" means every 5 minutes
•Cloud services that hold your data hostage
•Architectures that fall over when AWS hiccups
You're Our Person If
•You've built streaming systems that stayed up when it mattered
•"Eventually consistent" makes you uncomfortable
•Real-time inference excites you more than data warehousing
Especially If
•You've built on-premises streaming infrastructure that rivals cloud offerings
•You've done inference at the edge before edge was cool
•You understand hardware—from NVMe optimization to network tuning
•You've migrated from batch to streaming and never looked back
•You can make time-series data sing at scale
•You believe data gravity is a solvable problem
Why This Matters
Your pipelines will power:
•ML models making decisions while drill bits are turning
•Computer vision processing streams from harsh environments
•Analytics that prevent problems rather than explaining them later
•Systems where "historical analysis" means 30 seconds ago
This isn't building dashboards for quarterly reviews. This is infrastructure for decisions that can't wait.
Growth Path
Today: Building streaming pipelines that embarrass traditional ETL.
Six months: Architecting systems that make cloud vendors nervous.
One year: Publishing approaches that redefine industrial data processing.
When Databricks or Confluent tries to hire you, it'll be because you built something better than what they're selling.
Reality Check
You'll fight against a decades-old batch-processing mindset. You'll optimize systems down to microseconds. You'll build infrastructure in places with challenging connectivity. You'll explain why "real-time" isn't just a buzzword.
But you'll also enable genuinely new capabilities. You'll prove that industrial systems can be as responsive as trading platforms. You'll build the foundation for AI that reacts as fast as physics demands.
Ready to Stream?
Show us streaming systems you've built that others said were impossible. Tell us why you believe batch processing is (mostly) dead. Share your vision for data infrastructure unchained from cloud providers.
We're looking for someone who sees "process nightly" and thinks "why wait?"