The Robot Vision Group at the Department of Computing, Imperial College London, conducts research on real-time computer vision techniques applicable to robotics and
other demanding real-world, real-time applications. Our particular focus is on problems related to visual SLAM (Simultaneous Localisation and Mapping): how can a camera
flying through the 3D world estimate its location robustly, accurately and in real-time, and how far can it go in mapping or modelling the scene around it as it does this?
And crucially, we are interested in real-time solutions where continuous high frame-rate operation permits true prediction, active processing and interaction.
This group website has been active since October 2010 (redesigned in February) and contains regular news updates, as well as links to the publications of the group. You will no doubt find further
interesting information on group members' individual websites via the People tab on the left. And definitely check out the Videos tab, which links to
our YouTube channel with many different video demos.
I have a Post-Doc/RA position available in my Robot Vision Group at Imperial College London. It's on an exciting new collaborative project which has the aim of an integrated attack on the software, compiler, runtime/operating systems and architecture challenges in manycore computer systems, driven by 3D scene understanding. Our aim is to use real-time computer vision as a way of pushing the frontiers of the practical architectures and system software at the heart of future mass-market devices --- building the foundation for future applications based on full awareness of the three-dimensional environment. The project is a collaboration with experts all the way through a vertical stack from application to architecture, led by Prof. Paul Kelly at Imperial College, Prof. Michael O'Boyle in Edinburgh and Prof. Steve Furber in Manchester (co-designer of the original ARM microprocessor).
Please see this advert for full details; forward to anyone who might be interested and get in touch with any questions.
Welcome to Hanme Kim, a new RA in the group.
We had a successful week at ECCV 2012, where Ankur Handa presented his paper on evaluating high frame-rate camera tracking to much interest. The idea of this work, in collaboration with Richard Newcombe, Adrien Angeli and Andrew Davison, was to answer questions about why we normally run advanced visual tracking algorithms in the 20--60Hz range, when we now have cameras available which can capture much faster than that; taking into account the fact that as frame-rate increases, frame-to-frame tracking becomes easier in any tracker that uses prediction (as every tracker should!). Ankur conducted systematic experiments on how the performance of a whole image alignment tracker varies (in terms of accuracy and computational cost), using a dataset he has generated of photo-realistic video. This video was generated using ray-tracing of a detailed room model, plus the application of realistic noise and blur effects using parameters determined using experiments with a real camera. Samples of our photo-realistic video are shown below. You can download the full multi-frame-rate dataset, and soon all of the open source code needed to render your own similar sequences, from the project page.
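As a back-of-envelope illustration of why higher frame rates help prediction-based tracking (toy numbers and function names are our own, not taken from the paper): for roughly constant image-space motion, the displacement a tracker must recover between consecutive frames falls inversely with frame rate.

```python
def interframe_displacement_px(speed_px_per_s: float, frame_rate_hz: float) -> float:
    """Expected image-space motion between consecutive frames, in pixels."""
    return speed_px_per_s / frame_rate_hz

# A feature sweeping across the image at 600 px/s:
for hz in (20, 60, 200):
    print(f"{hz:3d} Hz -> {interframe_displacement_px(600.0, hz):.1f} px between frames")
```

A smaller per-frame displacement means a predictive tracker can converge with fewer pyramid levels and iterations per frame; whether the total per-second cost goes up or down is exactly the accuracy/cost trade-off the paper measures.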
We also presented KAZE Features, a new feature detection and description algorithm based on non-linear scale space decomposition, which shows much improved performance over SURF and SIFT in difficult wide-baseline matching problems. KAZE Features were developed by Pablo Fernandez Alcantarilla from the Université d'Auvergne, in collaboration with Adrien Bartoli. Pablo's source code implementing KAZE Features is available here.
KAZE Features (PDF format), Pablo Fernandez Alcantarilla, Adrien Bartoli and Andrew J. Davison, ECCV 2012.
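For readers unfamiliar with non-linear scale space: unlike the Gaussian (linear) scale space underlying SIFT and SURF, a non-linear scale space smooths within regions while preserving edges. A deliberately crude sketch in the spirit of Perona-Malik diffusion follows (KAZE itself uses more sophisticated conductivity functions and AOS solvers; the constants and names here are our own):

```python
import numpy as np

def perona_malik_step(img: np.ndarray, k: float = 10.0, dt: float = 0.2) -> np.ndarray:
    """One explicit step of Perona-Malik-style edge-preserving diffusion."""
    # Differences to the four neighbours (periodic border via np.roll; fine for a sketch).
    n = np.roll(img, -1, axis=0) - img
    s = np.roll(img, 1, axis=0) - img
    e = np.roll(img, -1, axis=1) - img
    w = np.roll(img, 1, axis=1) - img
    g = lambda d: np.exp(-(d / k) ** 2)  # conductivity: small across strong edges
    return img + dt * (g(n) * n + g(s) * s + g(e) * e + g(w) * w)

def nonlinear_scale_space(img: np.ndarray, levels: int = 4, steps: int = 5) -> list:
    """Stack of progressively diffused images: a toy non-linear scale space."""
    space = [img.astype(float)]
    for _ in range(levels - 1):
        cur = space[-1]
        for _ in range(steps):
            cur = perona_malik_step(cur)
        space.append(cur)
    return space
```

Because the conductivity shrinks where gradients are strong, fine texture is smoothed away across scales while object boundaries survive, which is what gives KAZE-style features their robustness.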
Well done to Hauke on passing his PhD viva. His PhD thesis, available soon in its final version, gives a detailed account of optimisation-based visual SLAM, comparing this with feature-based techniques and presenting Hauke's state-of-the-art Double Window Optimisation algorithm for scalable constant-time SLAM.
Welcome to Robert Lukierski, another new PhD student in the group.
Congratulations to Hauke on the submission of his PhD thesis. Hauke will remain in the group until September.
Jacek Zienkiewicz joined the group as a new PhD student. Welcome!
A funded PhD studentship is available in the Robot Vision Group to work on a project on 3D perception for a mobile robot in collaboration with a leading UK company. The project will build on all of our recent work on real-time monocular SLAM and reconstruction. See the full details here, and contact Andrew Davison with any queries. Note that due to the type of funding this position is only open to EU candidates, and that the closing date for applications is 20th March 2012.
Andrew Comport will give a talk on Large Scale Dense Localisation and Mapping in the Robot Vision Group on Thursday the 8th of March.
Recently, Hauke Strasdat, supported by other members of the group, released new visual SLAM software. Scalable Visual SLAM (ScaViSLAM), using double window optimisation, is available on GitHub.
The Robot Vision Group website has been redesigned. Check out the new Videos tab.
January 2012: Well done to Gerardo Carrera, who did a great job and successfully passed his PhD viva; and good luck to him as he leaves the lab to return to Mexico.
December 2011: Big congratulations to Steve Lovegrove who just passed his PhD viva with flying colours!
November 2011: Our group had a great ICCV in Barcelona, with a lot of interest in the papers we presented on DTAM and Double Window Optimisation (see the Publications and
Videos tabs for more info!). Also the workshop on Live Dense Reconstruction with Moving Cameras organised by Richard Newcombe, Andrew Davison and George Vogiatzis was very
successful and attended by about 200 people. Visit the website where you can now see a lot of photos from the event as well as presentation materials collected from the speakers.
October 2011: Richard Newcombe has won the best paper prize at ISMAR 2011 for the paper on the KinectFusion system he developed while interning at Microsoft Research Cambridge
last summer. The paper is now available online (see our Publications tab), and check out the video via the MSR project page or on YouTube. This system has been highly
touted by Microsoft, heavily featured in the tech press and demoed live by various high-level people, including, incredibly, Bill Gates himself in a recent talk he gave at
the University of Washington (starting at about 12 minutes into this video).
October 2011: Some updates: well done to Steve and Gerardo who have both just submitted their PhD theses and are awaiting their vivas; Renato just returned from a very
interesting summer internship with his sponsors AMD in Austin; and we welcome new PhD student Jan Jachnik, a recent graduate from Imperial's Mathematics Department.
September 2011: We just said farewell to Adrien Angeli who left the lab to return to Corsica and work in a start-up company there after three years as a post-doc in the
group. We marked the sad occasion of his leaving with an excellent Korean meal and a rapidly-becoming-a-tradition karaoke session. We'll miss him and wish him the best of luck.
September 2011: Ankur Handa, with the close support of Richard Newcombe and Adrien Angeli, has been investigating the role that the Legendre-Fenchel Transform can play in
optimisation problems in computer vision (for example in the depth map construction component of our new DTAM system). It gives a standard procedure for setting up and
solving a wide range of convex optimisation problems. He has written a tutorial-style technical report on this topic which we hope will be of interest to other researchers
looking into this area. See Ankur's homepage for more information.
Applications of the Legendre-Fenchel Transform (PDF format),
Ankur Handa, Richard A. Newcombe, Adrien Angeli and Andrew J. Davison, Imperial College Department of Computing Technical Report DTR11-7, 2011.
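For readers new to the transform, the basic definitions are worth stating (standard material summarised from the convex-optimisation literature, not excerpted from the report):

```latex
% Conjugate (Legendre-Fenchel transform) of a function f:
f^{*}(y) = \sup_{x} \bigl( \langle y, x \rangle - f(x) \bigr)
% For closed convex f, biconjugation recovers f:
f(x) = \sup_{y} \bigl( \langle y, x \rangle - f^{*}(y) \bigr)
% which rewrites a problem of the form \min_x f(Kx) + g(x)
% as the saddle-point problem
\min_{x} \max_{y} \; \langle Kx, y \rangle - f^{*}(y) + g(x)
```

For example, f(x) = |x| has conjugate equal to the indicator of [-1, 1], which is the basis of the dual formulations of total-variation regularisation that appear in dense depth map estimation.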
July 2011: Richard Newcombe, Andrew Davison and George Vogiatzis (Aston University) are organising a workshop on "Live Dense Reconstruction using Moving Cameras" to be held
at ICCV in November this year. See the workshop website, and please consider submitting an extended abstract describing a demonstration or poster in this area.
March 2011: Gerardo Carrera and Adrien Angeli will go to ICRA in Shanghai in May to present our work on automatic calibration of a multi-camera rig. We show that a rig of
two or more cameras mounted on a mobile robot can have their relative extrinsic locations automatically calibrated, even if they have no overlap in their fields of view.
The robot makes a movement pattern including a full rotation, and each camera builds its own monocular SLAM map. These maps are then matched, fused and jointly optimised,
imposing the rigidity of the whole rig, to estimate the camera configuration. We present results for both two and four cameras.
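The map-fusion step can be given some flavour with a toy version of its simplest ingredient: aligning two feature maps with known correspondences by least-squares rigid registration. This is a Kabsch/Procrustes sketch with our own function names, not the paper's joint optimisation, which additionally imposes rig rigidity across all cameras:

```python
import numpy as np

def align_maps(P: np.ndarray, Q: np.ndarray):
    """Least-squares rigid alignment: find R, t with Q ~ R @ P + t.

    P and Q are 3xN arrays of corresponding 3D feature positions."""
    cp = P.mean(axis=1, keepdims=True)
    cq = Q.mean(axis=1, keepdims=True)
    H = (Q - cq) @ (P - cp).T            # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = U @ D @ Vt
    t = cq - R @ cp
    return R, t
```

With maps expressed in each camera's own frame, such pairwise alignments (found via feature matching rather than given correspondences) seed the joint optimisation that recovers the rig extrinsics.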
September 2010: At ECCV Steven Lovegrove presented our paper on live spherical mosaicing and also gave a live demo of the system. In this work, we show how to make a
mosaicing system which operates sequentially in real-time but also generates globally consistent mosaics over a whole sphere. As video is gathered live from a purely
rotating camera held in the hand or on a tripod, two parallel processing threads operate: one tracks the camera pose at frame-rate relative to the current mosaic, stored as
a set of registered keyframes, while the second repeatedly globally optimises the keyframe poses and also refines an estimate of camera intrinsics. Both processes are based
on whole image alignment on a pyramid rather than feature matching, and Steve's implementation makes heavy use of GPU acceleration throughout the pipeline. Click on the
image below for an example high resolution mosaic.
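For a purely rotating camera, any two views are related by a 3x3 homography determined by the intrinsics K and the relative rotation R, which is what lets the whole mosaic live consistently on a sphere. A minimal sketch (the K values and helper names are our own illustration):

```python
import numpy as np

def rotation_homography(K: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Pixel mapping between two views of a purely rotating camera: H = K R K^-1."""
    return K @ R @ np.linalg.inv(K)

def warp_pixel(H: np.ndarray, u: float, v: float):
    """Apply homography H to pixel (u, v) via homogeneous coordinates."""
    x = H @ np.array([u, v, 1.0])
    return (x[0] / x[2], x[1] / x[2])

# Made-up intrinsics for illustration: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
```

In this picture, the tracking thread estimates R for each frame by whole image alignment against the mosaic keyframes, while the global optimisation thread refines all keyframe rotations together with K.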
September 2010: Adrien Angeli presented our paper at BMVC on live feature clustering. This is work in the direction of understanding how appearance and geometry information
can be used in a more unified way in visual SLAM than in current systems where place recognition and loop closure detection is done with bag-of-words type approaches which
discard geometry when looking up features which represent a place. Can we embed appearance information in the 3D world model in a more integrated way? Here we show a step
towards that by demonstrating that we can cluster the 3D features obtained from visual SLAM into meaningful clusters in real-time, where cluster membership depends both on
appearance similarity and geometrical proximity. These clusters may or may not correspond well to objects in the scene, but certainly represent repeatable structure which
should have good properties of viewpoint-invariance, and we plan to move on to using them for flexible and efficient place recognition, as well as potentially investigating
their use for semantic labelling.
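To make the idea concrete, here is a deliberately naive greedy clustering with a joint appearance-and-geometry membership test. The thresholds and names are our own toy choices; the paper's real-time method is more sophisticated:

```python
import numpy as np

def cluster_features(positions, descriptors, d_max: float = 0.5, a_max: float = 0.3):
    """Greedy single-linkage clustering of SLAM features.

    A feature joins an earlier feature's cluster only if it is BOTH
    geometrically close (3D distance < d_max) and similar in appearance
    (descriptor distance < a_max). Returns one integer label per feature."""
    n = len(positions)
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        for j in range(i):
            geo = np.linalg.norm(positions[i] - positions[j])
            app = np.linalg.norm(descriptors[i] - descriptors[j])
            if geo < d_max and app < a_max:
                labels[i] = labels[j]
                break
        if labels[i] == -1:
            labels[i] = next_label
            next_label += 1
    return labels
```

Requiring both conditions is what distinguishes this from pure bag-of-words grouping: features that look alike but lie far apart in the map stay in separate clusters.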