dbCI: An annotated video database of classroom interactions

This is a database of annotated videos of student-to-student discussions in an interactive STEM classroom. The video was recorded by a six-camera array in a university lecture hall. The database serves as a resource for developing and evaluating computer vision techniques for detecting, recognizing, and analyzing social interactions.

What makes the classroom "interactive"?

The classroom is interactive because students frequently engage in ad hoc peer-to-peer discussions, guided by a framework known as peer instruction. Each discussion period starts with the instructor posing a question that has been carefully designed to highlight common misconceptions. The students are given a moment to formulate and electronically commit their individual answers, and then they are asked to discuss their thinking and answers with the people around them. After a few minutes of discussion in ad hoc groups, each student electronically commits an individual answer again. Based on the results, the instructor decides whether more explanation is needed before moving on to the next concept.
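The two commits made by each student can be compared directly. The following is a minimal sketch of such a comparison, not the project's actual tooling: the student identifiers, the answer letters, and the function names `tally` and `students_who_changed` are all hypothetical, and the sketch assumes each commit is available as a mapping from student identifier to answer.

```python
from collections import Counter
from typing import Dict, List


def tally(answers: Dict[str, str]) -> Counter:
    """Count how many students committed each answer."""
    return Counter(answers.values())


def students_who_changed(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return the sorted identifiers of students whose answer differs between commits."""
    return sorted(s for s in before if s in after and before[s] != after[s])


# Hypothetical responses for one discussion period (illustrative values only).
before = {"s01": "A", "s02": "B", "s03": "B", "s04": "C"}
after = {"s01": "B", "s02": "B", "s03": "B", "s04": "B"}

print(tally(before)["B"], tally(after)["B"])  # students choosing "B" before and after
print(students_who_changed(before, after))    # students who switched answers
```

How the instructor uses such a summary to decide whether more explanation is needed is not specified here, so the sketch stops at the tallies.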

What does the video look like?

The cameras recorded for the duration of all lectures, and the raw video from a one-term course totaled more than 200 camera-hours. From this we extracted the video clips that correspond to the discussion periods, producing about 300 clips in total (six cameras times about fifty discussion periods). Each video clip is between one and five minutes long, and each contains observations of a handful of small-group discussions. The spatial resolution of each clip is 800x600 pixels, the temporal resolution is 14 fps, and there is no audio. Example frames are in our CVPR 2013 paper.
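As a rough sense of scale, the figures above imply the following clip and frame counts. This is a back-of-the-envelope sketch: the constant and function names are illustrative, and the count of fifty discussion periods is the approximate figure quoted above, not an exact one.

```python
# Figures taken from the description above; names are illustrative.
NUM_CAMERAS = 6
NUM_DISCUSSION_PERIODS = 50  # "about fifty"
FRAME_RATE_FPS = 14
FRAME_WIDTH, FRAME_HEIGHT = 800, 600


def frames_in_clip(duration_minutes: float, fps: int = FRAME_RATE_FPS) -> int:
    """Number of frames in a clip of the given duration."""
    return round(duration_minutes * 60 * fps)


total_clips = NUM_CAMERAS * NUM_DISCUSSION_PERIODS
shortest_clip_frames = frames_in_clip(1)  # one-minute clip
longest_clip_frames = frames_in_clip(5)   # five-minute clip
pixels_per_frame = FRAME_WIDTH * FRAME_HEIGHT

print(total_clips)           # 300
print(shortest_clip_frames)  # 840
print(longest_clip_frames)   # 4200
print(pixels_per_frame)      # 480000
```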

What are the annotations?

The video clips in the database are associated with a variety of annotations:

Face locations and identifying labels. Manually verified bounding boxes around the faces, each labeled with an anonymous student identifier.
Coarse head pose. Machine-generated estimates of the head pose in every frame, each quantized to one of nine directions.
Torso locations. Manually verified bounding boxes around the torso below each face.
Dense optical flow. Full-resolution optical flow from Brox and Malik's algorithm (PAMI 2011).
Question responses. The question associated with each peer-instruction discussion period, along with the per-student answers that were electronically committed before and after the discussion.
Interaction labels. For a growing subset of the data, manual per-frame annotations of student-to-student interactions, including where they are within the frame, when they begin and end, and whether they are "on-topic" or "off-topic". (These annotations are made by human experts who watch the raw video in sync with audio from a separate microphone array.)
Redaction labels. Pixels corresponding to a small number of students (approximately 5%) have been redacted because those students elected not to participate in the study. Redacted pixels have been set to zero intensity, and a list of the redacted pixels is stored separately.
Publication labels. The majority of students consented to their images being published in academic papers and talks. For those who did not, there is a list of corresponding pixels that cannot appear in papers or talks. Examples are in our CVPR 2013 paper.
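To make the annotation types above concrete, here is a minimal sketch of how a face annotation and a redaction list might be represented in code. It does not reflect the database's actual storage format, which is not described here. The class names, field names, and the function `apply_redaction` are hypothetical, and the sketch assumes that head-pose directions are indexed 0 to 8, that pixels are addressed as (row, column), and that a frame is a grayscale grid of intensities.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

NUM_HEAD_POSE_BINS = 9  # head pose is quantized to one of nine directions


@dataclass(frozen=True)
class BoundingBox:
    x: int       # left edge, in pixels (assumed convention)
    y: int       # top edge, in pixels (assumed convention)
    width: int
    height: int


@dataclass(frozen=True)
class FaceAnnotation:
    frame_index: int
    student_id: str     # anonymous identifier, e.g. "S017" (made up)
    face_box: BoundingBox
    head_pose_bin: int  # assumed to be indexed 0..8

    def __post_init__(self) -> None:
        if not 0 <= self.head_pose_bin < NUM_HEAD_POSE_BINS:
            raise ValueError("head_pose_bin must be in the range 0..8")


Pixel = Tuple[int, int]  # (row, column), an assumed convention


def apply_redaction(frame: List[List[int]], redacted: Set[Pixel]) -> List[List[int]]:
    """Return a copy of a grayscale frame with the listed pixels set to zero."""
    return [
        [0 if (r, c) in redacted else value for c, value in enumerate(row)]
        for r, row in enumerate(frame)
    ]


# Tiny illustrative frame; real frames are 800x600.
frame = [[10, 20, 30], [40, 50, 60]]
redacted_pixels = {(0, 1), (1, 2)}
print(apply_redaction(frame, redacted_pixels))  # [[10, 0, 30], [40, 50, 0]]
```

Keeping the redaction list separate from the frame, as the database does, lets downstream code distinguish a redacted pixel from one that is simply dark.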

How can I work with the database?

The database lives on protected servers, and all analysis of it is conducted on those servers. Access to these servers requires two things: 1) training in the ethics of human subjects research (a short, free online training course through the Collaborative Institutional Training Initiative); and 2) pre-approval by the project leaders and Harvard's institutional review board. For more information, contact Todd Zickler.

Publications that use the database

Ruonan Li, Parker Porfilio, and Todd Zickler, "Finding Group Interactions in Social Clutter", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.

Ruonan Li, Parker Porfilio, and Todd Zickler, "Finding Group Interactions in Social Clutter", Technical Report, Computer Science Group, Harvard University TR-01-13, 2013.

Credits

This database was created by Laura Tucker with help from Ruonan Li, Brian Lukoff, Parker Porfilio, Ely Spears, John Brunelle, Rachel Scherr, Eric Mazur, Todd Zickler, and many others.

This resource was created with support from the US National Science Foundation.