MASTERS PROJECT: Deep Learning Approaches to Single-View Marker-Less Human Body Motion Capture
- Aleksandra Petkova
- Jun 26, 2020
- 3 min read
Updated: Apr 3
On the 18th of May 2020 I started my master's project, the largest project I have taken on so far. It will rely heavily on peer-reviewed research papers, as not much else is out there on this topic at the moment. The direction may change as I learn, so the goal for the three months of work is first and foremost to create a working prototype as a proof of concept, then to run an evaluation phase to gather feedback on the progress, after which the project design will be re-evaluated.
What is this project really about?
Simply put, I am attempting to extract game-ready 3D motion data from a single marker-less video of a human performing an action or a sequence of actions. I chose this topic because motion capture is very powerful, and not just in the games industry: it is also used in film, health, sports and robotics, and the list will likely keep growing. What initially sparked my interest, though, was the potential to reduce the development time of my personal projects. Animating a character takes a lot of effort and time, especially the main character, the one the player interacts with the most. Instead of starting from scratch, my software aims to give animators a good basis and a head start so they can push through their work quicker. This can be especially useful for indie developers, who don't necessarily have the resources to spend on expensive tech and a dedicated motion capture studio.
What have I been up to all this time?
I spent the first two weeks researching and putting together a specification [File Attachment Above], then at the beginning of June I started a literature review to get a better idea of how to approach the problem and what is already out there. Just a few weeks ago (15th June) I completed the backlog and at last started implementation. So I spent about four weeks on research beforehand, and will likely do more alongside the implementation. As for the development methodology, I decided to follow an Agile/TDD approach, so the first task after setting up the project was to identify the core functionality and write tests for it. If you'd like to follow my progress, regular updates will be posted on GitHub. The planned timeline of the project can be found below.
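As a minimal sketch of that test-first workflow: before implementing a piece of core functionality, I write tests that pin down its expected behaviour. The `frame_indices` helper below (which frames to pull out of a video when sampling every Nth frame) is my own illustration, not code from the project:

```python
def frame_indices(total_frames, step):
    """Return the indices of the frames to extract, sampling every `step` frames."""
    if total_frames < 0 or step <= 0:
        raise ValueError("total_frames must be >= 0 and step > 0")
    return list(range(0, total_frames, step))

# TDD: tests like these are written first and define the expected behaviour.
def test_extracts_every_frame():
    assert frame_indices(4, 1) == [0, 1, 2, 3]

def test_samples_every_third_frame():
    assert frame_indices(10, 3) == [0, 3, 6, 9]

test_extracts_every_frame()
test_samples_every_third_frame()
```

Only once the tests exist does the implementation get written to make them pass.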
Project Timeline
18th May - 14th June: Research, Project Specification, Literature Review Draft.
15th June - 29th June: Backlog, Sprint 1 (Install Tools, Create UML Class Diagrams, Build Basic CNN with TensorFlow, Load a Video and Extract its Frames)
30th June - 14th July: Sprint 2 (Further Research on Deep Learning for Computer Vision and CNN Configurations, Implement Key Point Detection, Update the Neural Network Configuration Based on Research to Improve Results)
15th July - 29th July: Sprint 3 (Update Literature Review, Depth Map Generation from a Single Video, Using the Generated Depth Maps to Raise the Dimension from 2D to 3D)
30th July - 13th August: Sprint 4 (Implement Pre-Convolutional Video Processing to Take Load off the CNN)
14th August - 28th August: Sprint 5 (3rd Party Evaluation, Complete Report and Prototype)
Future Work: Revising Implementation Based on Feedback, Performance Improvements, Integrating the Software with Unreal Engine, Blender, Unity Engine as Well as Allowing a Standalone Use, Refining User Interface, Multi-Person Motion Capture.
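To give a flavour of Sprint 1's "Build Basic CNN with TensorFlow" task, here is a minimal Keras sketch that regresses 2D key point coordinates from a frame. The input size, layer widths and the 17-joint skeleton (COCO-style) are assumptions for illustration, not the project's final configuration:

```python
import tensorflow as tf

NUM_KEYPOINTS = 17  # assumed COCO-style skeleton, for illustration only

def build_model(input_shape=(256, 256, 3)):
    """Minimal CNN that regresses an (x, y) pair for each key point."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(NUM_KEYPOINTS * 2),  # (x, y) per joint
    ])

model = build_model()
```

The configuration will change as Sprint 2's research feeds back in; this is only a starting skeleton.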
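For Sprint 3, raising the dimension from 2D to 3D with a depth map amounts to back-projecting each detected key point through a pinhole camera model. The sketch below assumes known camera intrinsics (fx, fy, cx, cy), which is a simplification of what the project will actually need to estimate:

```python
def lift_to_3d(keypoints_2d, depth_map, fx, fy, cx, cy):
    """Back-project 2D pixel key points into camera-space 3D points.

    keypoints_2d: list of (u, v) pixel coordinates
    depth_map:    2D grid of per-pixel depth values, indexed [row][col]
    fx, fy:       focal lengths in pixels; cx, cy: principal point
    """
    points_3d = []
    for u, v in keypoints_2d:
        d = depth_map[int(round(v))][int(round(u))]  # depth at the key point
        x = (u - cx) * d / fx
        y = (v - cy) * d / fy
        points_3d.append((x, y, d))
    return points_3d
```

For example, a key point at the principal point with depth 2.0 lands on the optical axis at Z = 2.0.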