Below I describe a selection of my recent projects, grouped by topic.
End-User Programming for Web Automation
Helena is a high-level programming language for web automation tasks such as data collection and data entry. Users draft Helena programs by recording themselves completing a single subtask; Helena's Programming By Demonstration (PBD) tools, described below, then use these recordings to write programs that complete all subtasks. The Helena editor lets users adapt, extend, and understand their programs.
Ringer is a low-level programming language for web automation tasks. Many statements in the high-level Helena language are implemented with Ringer. Ringer comes with a record-and-replay tool: when a user demonstrates an interaction in a normal browser, the tool writes a straight-line Ringer program that completes the same interaction on the same pages.
Rousillon is a PBD tool for writing Helena programs that automate large-scale web data collection tasks. It is implemented as a Chrome extension. Rousillon lets users demonstrate how to collect the first row of a multi-relational dataset, then generalizes the straight-line interaction into a program for collecting all rows.
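The generalization step can be illustrated with a toy sketch. It assumes, purely for illustration, that a page is a nested Python structure and that the demonstration was recorded as a path of keys and row indices; the names `generalize` and `collect` are hypothetical and are not part of Rousillon, which works on live webpages rather than on dictionaries.

```python
def generalize(path):
    """Replace each concrete row index in a recorded path with a wildcard."""
    return tuple("*" if isinstance(step, int) else step for step in path)


def collect(data, pattern):
    """Yield every value reachable via pattern, iterating at each wildcard."""
    if not pattern:
        yield data
        return
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # A wildcard becomes a loop over every row at this level.
        for item in data:
            yield from collect(item, rest)
    else:
        yield from collect(data[head], rest)


# Toy stand-in for a site with two relations: authors, and papers per author.
site = {
    "authors": [
        {"name": "A", "papers": [{"title": "p1"}, {"title": "p2"}]},
        {"name": "B", "papers": [{"title": "p3"}]},
    ]
}

# The user touched only the first paper of the first author.
recorded = ("authors", 0, "papers", 0, "title")
titles = list(collect(site, generalize(recorded)))
```

Each wildcard corresponds to one loop, so a path through two relations yields a doubly nested traversal from a single straight-line demonstration.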
DIYDA (the DIY Digital Assistant) is a tool for adding custom 'skills' to a voice assistant. To add a new skill, users (i) provide the text of the question they'll ask, (ii) demonstrate how to find the answer to the question on the web, then (iii) tell DIYDA what to say aloud to answer the question. Based on the user's demonstration, DIYDA writes a Helena program to automate the web interaction. This project is still in development, but check back for updates!
End-User Programming for Other Domains
Dapper synthesizes Probabilistic Programming Language (PPL) models from input datasets. The goal is to put small, readable PPL models in the hands of social scientists and data scientists, so that they have access to the powerful abstractions PPLs offer for inference and simulated interventions. Dapper currently produces output programs in the BLOG language, but its intermediate representation (IR) makes it easily retargetable.
Driver uses Syntax-Guided Synthesis (SyGuS) to synthesize reactive robot motion planners. As input, it takes a simple description of the environment, the target, the obstacles, how the obstacles can move, and how the robot itself can move. As output, it produces a motion planner that can react to an adversarial environment.
Usable Parallelization
Dicer is a framework for parallelizing large web automation experiments. As more tools start consuming webpages as inputs, the need for controlled experiments on these tools grows. Dicer facilitates such experiments by (i) using a custom caching proxy server to hold webpage inputs constant during a programmer-defined session and (ii) offering a simple programming model that allows Dicer to automatically and transparently parallelize experiments.
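The two ideas, holding inputs constant for a session and parallelizing over them, can be sketched in a few lines. This is a minimal in-process illustration, not Dicer's implementation: `SessionCache`, `run_experiment`, and the injected `fetch` function are hypothetical names, and a real caching proxy would sit between the browser and the network rather than inside the program.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SessionCache:
    """Serves each URL's first-seen content for the lifetime of one session."""

    def __init__(self, fetch):
        self._fetch = fetch          # hypothetical fetcher: url -> page content
        self._pages = {}
        self._lock = threading.Lock()

    def get(self, url):
        # Holding the lock across the fetch keeps the sketch simple and
        # guarantees that each URL is fetched at most once per session.
        with self._lock:
            if url not in self._pages:
                self._pages[url] = self._fetch(url)
            return self._pages[url]


def run_experiment(task, urls, cache, workers=4):
    """Apply task(page) to every URL in parallel; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda url: task(cache.get(url)), urls))
```

Because every worker reads through the same cache, two tools compared within one session see identical page content even if the live page changes in the meantime.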
The Parallel Skip Block, a new language construct introduced in Helena, lets end users parallelize their web automation scripts by answering questions they already understand. In particular, users indicate how they decide whether two online objects (e.g., two authors or two restaurants) represent the same entity. Our results show that people who identify as non-programmers can learn to use this construct correctly in seven minutes, on average. Once a user has added a skip block, Helena can use it to automatically parallelize the program, achieving near-ideal speedups.
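The idea behind a skip block can be sketched as follows. This is an illustrative approximation in Python, not Helena syntax or Helena's implementation: the `SkipBlock` class, the `scrape_all` function, and the choice of name plus affiliation as the entity key are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SkipBlock:
    """Runs a body at most once per entity key, even across worker threads."""

    def __init__(self, key_fn):
        self._key_fn = key_fn        # the user's notion of "same entity"
        self._claimed = set()
        self._lock = threading.Lock()

    def run(self, obj, body):
        key = self._key_fn(obj)
        with self._lock:
            if key in self._claimed:
                return False         # another worker already claimed this entity
            self._claimed.add(key)
        body(obj)                    # the body runs outside the lock
        return True


def scrape_all(authors, workers=4):
    """Process each distinct author once; return their normalized names."""
    results = []
    results_lock = threading.Lock()
    # Assumed key: two records are the same author if name and affiliation
    # match, ignoring case.
    block = SkipBlock(lambda a: (a["name"].lower(), a["affiliation"].lower()))

    def body(author):
        with results_lock:
            results.append(author["name"].lower())

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda a: block.run(a, body), authors))
    return sorted(results)
```

The user supplies only the key function. Because workers claim entities through a shared set, they can traverse overlapping parts of the input without repeating each other's work, which is what makes the parallelization safe.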