Carnegie Mellon University
April 16, 2014

Video Forensics

Extracting and Analyzing Information from Video about Human Rights, Conflict, and Disaster

3:00 PM - 4:00 PM
NSH 1305

Jay Aronson
Associate Professor of Science, Technology and Society, Carnegie Mellon

Since the Syrian Civil War began three years ago, Syrians have uploaded an estimated 500,000 videos to social media sites documenting the violence in their country. Participants in the Arab Spring uprisings, and other events around the world, have recorded their experiences and posted these videos online for the global community to see. The ubiquity of mobile phones with excellent cameras and internet access means that ordinary citizens, victims of human rights abuse, and participants in armed conflicts, protests, and disaster situations can now do the same in much of the world. Repressive governments, armed militias, mafia groups, and drug traffickers can view these videos and post their own as well. This set of developments creates a host of ethical, security, and political dilemmas, but it also gives human rights activists, military personnel, politicians, lawyers, academics, and ordinary people the opportunity to learn more about the on-the-ground realities of conflict and disaster zones than ever before. For this potential to be realized, however, numerous technical challenges need to be addressed, most fundamentally: how to organize this mass of content, extract relevant information from it, analyze this information, and package conclusions into compelling, comprehensible, and actionable formats. In my presentation, I will talk about how computer vision, machine learning, and natural language processing might improve our ability to use video analytics for the advancement of humanity. Some examples include determining when two or more videos capture the same event from different perspectives; object detection (e.g., faces, weapons, military insignia, landscape); feature matching (i.e., detecting similarities across multiple videos); audio analysis (gunshots, signature sounds, speech recognition); and reconstructing events from multiple available videos using 3D modeling or other techniques.