QuackTech Innovation
What to investigate, which evidence to trust, where to allocate the next experiment: the hardest research decisions stay implicit. We build the systems that make them explicit.
Capabilities
Software that carries the criteria scientists apply at each stage of research, from ideation and literature synthesis to evaluation, planning, and experimental execution.
We capture how a scientist evaluates ideas, evidence, and directions, then encode it as a framework you can audit, share, and re-run on new inputs.
Discovery, hypothesis, evaluation, planning, execution. Calibrated judgment models sit at each stage as directional filters, keeping the researcher in the driver's seat.
Reward models freeze the day they ship; bibliometric proxies lag by years. Our criteria are re-fitted after every round of human feedback, so they track the field as it moves.
Selected Work
Three recurring shapes our judgment systems take in practice. Client-specific instances are redacted; the patterns below are what we keep reaching for across domains.
Pattern 01
An evaluation function trained from rounds of expert feedback. Each disagreement is one data point; over time the system's scores approach the scientist's, and the remaining gaps mark the dimensions where intuition is still doing the work.
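As a rough illustration of this pattern, the sketch below treats each disagreement as one training example for a simple linear scorer. The `LinearJudge` class, its feature vectors, and the learning rate are hypothetical stand-ins chosen for brevity, not a description of any deployed system.

```python
from dataclasses import dataclass


@dataclass
class LinearJudge:
    """Hypothetical scorer: a weighted sum over numeric features of a candidate."""
    weights: list[float]
    learning_rate: float = 0.1  # assumed value, chosen only for the example

    def score(self, features: list[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features: list[float], expert_score: float) -> float:
        """Nudge the weights toward the expert's score; return the gap seen before the update."""
        gap = expert_score - self.score(features)
        self.weights = [
            w + self.learning_rate * gap * x
            for w, x in zip(self.weights, features)
        ]
        return gap


judge = LinearJudge(weights=[0.0, 0.0])
candidate = [1.0, 0.5]  # made-up feature values
first_gap = judge.update(candidate, expert_score=1.0)
second_gap = judge.update(candidate, expert_score=1.0)
```

Under these assumptions the gap on a repeated candidate shrinks with each round of feedback; whatever gap persists across many candidates is the part the features do not yet capture.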
Pattern 02
A structured card format for research problems: the claim, the supporting evidence, rubric scores, reviewer commentary, and iteration history held in one file. Judgment moves from meeting notes into a versioned record any reviewer can revisit.
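One way such a card could be laid out, sketched as a Python dataclass serialized to JSON. The `ProblemCard` name, its field names, and the choice of JSON are illustrative assumptions; the text above only specifies which kinds of information the card holds.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ProblemCard:
    """Hypothetical card: one research problem and its review trail in one record."""
    claim: str
    evidence: list[str] = field(default_factory=list)
    rubric_scores: dict[str, float] = field(default_factory=dict)
    commentary: list[str] = field(default_factory=list)
    history: list[dict] = field(default_factory=list)

    def revise(self, note: str, new_scores: dict[str, float]) -> None:
        """Record the scores being replaced, then apply the new ones."""
        self.history.append({
            "revision": len(self.history) + 1,
            "note": note,
            "previous_scores": dict(self.rubric_scores),
        })
        self.rubric_scores.update(new_scores)

    def to_json(self) -> str:
        # Sorted keys keep diffs stable when the file is kept under version control.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


card = ProblemCard(claim="Example claim", rubric_scores={"novelty": 3.0})
card.revise("Second review round", {"novelty": 4.0})
serialized = card.to_json()
```

Keeping the prior scores in `history` means a later reviewer can see how a judgment changed as well as where it ended up.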
Pattern 03
A rubric that splits the single verdict into independent dimensions. Human and system scores line up per axis, making it obvious which dimension each disagreement sits on, and which one deserves the next round of calibration.
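A minimal sketch of the per-axis comparison, assuming each rubric is a mapping from dimension name to numeric score. The function names and the example dimensions (`novelty`, `feasibility`, `rigor`) are hypothetical.

```python
def per_axis_gaps(human: dict[str, float], system: dict[str, float]) -> dict[str, float]:
    """Signed gap (system minus human) on every dimension both sides scored."""
    shared = sorted(human.keys() & system.keys())
    return {axis: system[axis] - human[axis] for axis in shared}


def next_axis_to_calibrate(human: dict[str, float], system: dict[str, float]) -> str:
    """Dimension with the largest absolute gap; raises ValueError if none are shared."""
    gaps = per_axis_gaps(human, system)
    return max(gaps, key=lambda axis: abs(gaps[axis]))


human_scores = {"novelty": 4.0, "feasibility": 2.0, "rigor": 3.0}
system_scores = {"novelty": 4.0, "feasibility": 4.0, "rigor": 3.5}
gaps = per_axis_gaps(human_scores, system_scores)
target = next_axis_to_calibrate(human_scores, system_scores)
```

In this made-up example the scores agree on novelty and differ most on feasibility, so that axis would be the candidate for the next calibration round.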
Methodology
The research decisions that matter most are the ones scientists cannot fully articulate. We make those decisions legible in three steps, then put them back in the loop.
01
Sit with the domain scientist. Turn the criteria, heuristics, and gut checks they apply in practice into an explicit, testable draft.
02
Run the draft on real candidates; every disagreement with the expert becomes a training signal. Accuracy improves, and the remaining gaps point at what we missed.
03
Scientific standards shift as the field produces new evidence. Feedback keeps coming in; the rubric is re-fitted on a schedule, so the system reflects current practice instead of last year's.
Team
Founder & Chief Scientist
A decade of shipping production ML at scale. The current focus: can the intuition a working scientist applies day-to-day be written down, calibrated against their own feedback, and run as software?
QuackTech is currently embedded with experimental physics groups at Yale, co-developing judgment systems for research taste, hypothesis evaluation, and problem selection.
Prior: Staff Applied Scientist, Samsara · Staff ML Engineer, Pinterest · Senior SDE, Microsoft · PhD, Columbia University