Jekyll2019-09-28T08:42:25-07:00https://shreyasskandan.github.io/feed.xmlShreyas SkandanPhD Student at the University of PennsylvaniaShreyas S. Shivakumarsshreyas@seas.upenn.eduDFuseNet: Deep Fusion of RGB and Sparse Depth Information2019-04-18T00:00:00-07:002019-04-18T00:00:00-07:00https://shreyasskandan.github.io/posts/dfusenet<p>Code for our paper <em>“DFuseNet: Deep Fusion of RGB and Sparse Depth Information for Image Guided Dense Depth Completion”</em> is now on GitHub -
<a href="https://github.com/ShreyasSkandanS/DFuseNet">CODE</a></p>
<p>The ARXIV paper can be found <a href="https://arxiv.org/pdf/1902.00761.pdf">here</a>.</p>
<p><img src="/images/dfusenet_kitti.png" alt="DFuseNetKitti" /></p>
<p>We also present a small dataset of calibrated RGB and LiDAR data from a short drive around Philadelphia as an additional test set. This dataset can be found <a href="https://github.com/ShreyasSkandanS/DFuseNet">here</a>.</p>
<h3 id="abstract">Abstract:</h3>
<p>In this paper we propose a convolutional neural network that is designed to upsample a series of sparse range measurements based on the contextual cues gleaned from a high resolution intensity image. Our approach draws inspiration from related work on super-resolution and in-painting. We propose a novel architecture that seeks to pull contextual cues separately from the intensity image and the depth features and then fuse them later in the network. We argue that this approach effectively exploits the relationship between the two modalities and produces accurate results while respecting salient image structures. We present experimental results to demonstrate that our approach is comparable with state of the art methods and generalizes well across multiple datasets.</p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduCode for our paper “DFuseNet: Deep Fusion of RGB and Sparse Depth Information for Image Guided Dense Depth Completion” is now on GitHub - CODEDARPA Subterranean Challenge Integrated Exercise2019-04-07T00:00:00-07:002019-04-07T00:00:00-07:00https://shreyasskandan.github.io/posts/darpastix<p>Our team (PLUTO) just got back from a successful run at the DARPA STIX in Colorado.</p>
<p><img src="/images/stixteam.jpg" alt="TEAM" /></p>
<p>We took platforms from both Ghost Robotics and Exyn Technologies:</p>
<p><strong>Ghost Robotics Vision Platform:</strong>
<img src="/images/ghost_1.png" alt="GhostRobotics1" /></p>
<p><strong>Ghost Robotics Vision Platform:</strong>
<img src="/images/ghost_2.png" alt="GhostRobotics2" /></p>
<p><strong>Ghost Robotics Vision Platform and Exyn Aerial Platform:</strong>
<img src="/images/stix_platforms.png" alt="RobotTeam" /></p>
<p>To keep track of our team and our progress through this challenge, watch our <a href="https://pluto-subt.github.io/index.html">website</a>.</p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduOur team (PLUTO) just got back from a successful run at the DARPA STIX in Colorado.ICRA 2019 Accepted Papers2019-02-21T00:00:00-08:002019-02-21T00:00:00-08:00https://shreyasskandan.github.io/posts/icra_accepted<p>Two papers that I was a part of:</p>
<h2 id="1-the-open-vision-computer-an-integrated-sensing-and-compute-system-for-mobile-robots">1. The Open Vision Computer: An Integrated Sensing and Compute System for Mobile Robots</h2>
<p><a href="https://arxiv.org/abs/1809.07674">Paper (arXiv)</a></p>
<h4 id="abstract">Abstract</h4>
<p>In this paper we describe the Open Vision Computer (OVC) which was designed to support high speed, vision guided autonomous drone flight. In particular our aim was to develop a system that would be suitable for relatively small-scale flying platforms where size, weight, power consumption and computational performance were all important considerations. This manuscript describes the primary features of our OVC system and explains how they are used to support fully autonomous indoor and outdoor exploration and navigation operations on our Falcon 250 quadrotor platform.</p>
<p><a href="https://youtube.com/watch?v=dMxgNf8cXkI"><img src="https://img.youtube.com/vi/dMxgNf8cXkI/0.jpg" alt="OVC" /></a></p>
<h2 id="2-real-time-dense-depth-estimation-by-fusing-stereo-with-sparse-depth-measurements">2. Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements</h2>
<p><a href="https://arxiv.org/abs/1809.07677">Paper (arXiv)</a></p>
<h4 id="abstract-1">Abstract</h4>
<p>We present an approach to depth estimation that fuses information from a stereo pair with sparse range measurements derived from a LIDAR sensor or a range camera. The goal of this work is to exploit the complementary strengths of the two sensor modalities, the accurate but sparse range measurements and the ambiguous but dense stereo information. These two sources are effectively and efficiently fused by combining ideas from anisotropic diffusion and semi-global matching.</p>
<p>We evaluate our approach on the KITTI 2015 and Middlebury 2014 datasets, using randomly sampled ground truth range measurements as our sparse depth input. We achieve significant performance improvements with a small fraction of range measurements on both datasets. We also provide qualitative results from our platform using the PMDTec Monstar sensor. Our entire pipeline runs on an NVIDIA TX-2 platform at 5Hz on 1280x1024 stereo images with 128 disparity levels.</p>
<p><a href="https://youtube.com/watch?v=p_jCRGMqE7Y"><img src="https://img.youtube.com/vi/p_jCRGMqE7Y/0.jpg" alt="StereoDepth" /></a></p>
<p>Both papers have been accepted to the “IEEE International Conference on Robotics and Automation 2019” (ICRA 2019). I will be attending to present this work, so feel free to reach out to me if you’re there.</p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduTwo papers that I was a part of:Stereo and Sparse Depth Fusion2018-11-21T00:00:00-08:002018-11-21T00:00:00-08:00https://shreyasskandan.github.io/posts/stereofusion<p>Code for our paper <em>“Real Time Dense Depth Estimation by Fusing Stereo with
Sparse Depth Measurements”</em> is now on GitHub -
<a href="https://github.com/ShreyasSkandanS/stereo_sparse_depth_fusion">CODE</a></p>
<p>The ARXIV paper can be found <a href="https://arxiv.org/pdf/1809.07677.pdf">here</a>.</p>
<h3 id="abstract">Abstract:</h3>
<p>We present an approach to depth estimation that fuses information from a stereo pair with sparse range measurements derived from a LIDAR sensor or a range camera. The goal of this work is to exploit the complementary strengths of the two sensor modalities, the accurate but sparse range measurements and the ambiguous but dense stereo information. These two sources are effectively and efficiently fused by combining ideas from anisotropic diffusion and semi-global matching.</p>
<p>We evaluate our approach on the KITTI 2015 and Middlebury 2014 datasets, using randomly sampled ground truth range measurements as our sparse depth input. We achieve significant performance improvements with a small fraction of range measurements on both datasets. We also provide qualitative results from our platform using the PMDTec Monstar sensor. Our entire pipeline runs on an NVIDIA TX-2 platform at 5Hz on 1280x1024 stereo images with 128 disparity levels.</p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduCode for our paper “Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements” is now on GitHub - CODEGTC DC 2018 - Jetson AGX Xavier Developer Day2018-10-24T00:00:00-07:002018-10-24T00:00:00-07:00https://shreyasskandan.github.io/posts/gtcdc<p>I was invited to speak at the NVIDIA GPU Technology Conference at their
NVIDIA Jetson AGX Xavier Developer Day. Here is a video from the talk. The talk
is mainly centered around the use of the NVIDIA Jetson platform on our
quadrotors and some information regarding the autonomous UAV software stack that was designed
at our lab.</p>
<p><a href="https://www.youtube.com/watch?v=FLunb5Y-USI"><img src="https://img.youtube.com/vi/FLunb5Y-USI/0.jpg" alt="GTCDC2018" /></a></p>
<p>Speaker List:
<img src="/images/gtcdc.png" alt="gtcdc" /></p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduI was invited to speak at the NVIDIA GPU Technology Conference at their NVIDIA Jetson AGX Xavier Developer Day. Here is a video from the talk. The talk is mainly centered around the use of the NVIDIA Jetson platform on our quadrotors and some information regarding the autonomous UAV software stack that was designed at our lab.Pytorch Scribbles2018-10-06T00:00:00-07:002018-10-06T00:00:00-07:00https://shreyasskandan.github.io/posts/pytorchtips<h1 id="pytorch-scribble-pad">PyTorch Scribble Pad</h1>
<p>This page is a collection of notes and tips for myself in getting familiar with
the workings of PyTorch.</p>
<p><strong>1. Transferring Weights</strong></p>
<p>If you have a pretrained network A with some layers A:{x,y,z} and you have a new
network architecture with some layers B:{w,x,y,z,a}, and you wish to transfer
weights learned from network A for layers {x,y,z} to B, you can do it using the
following:</p>
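<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

# Weights saved from network A, and the current weights of the new network B
pretrained_model_weights = torch.load('../path/model.pth')
new_model_weights = model.state_dict()
# Keep only the entries whose names also exist in B
pretrained_model_weights = {k: v for k, v in pretrained_model_weights.items() if k in new_model_weights}
# Overwrite B's matching entries and load the merged dictionary
new_model_weights.update(pretrained_model_weights)
model.load_state_dict(new_model_weights)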
</code></pre></div></div>
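<p>One caveat with the snippet above: a layer can share a name across A and B but
differ in size, in which case <em>load_state_dict</em> will complain about a size
mismatch. Below is a minimal sketch of a shape-aware filter. The helper names
<em>select_transferable</em> and <em>merge_weights</em> are hypothetical, and the code
only assumes that each value exposes a <em>shape</em> attribute, as tensors do.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def select_transferable(pretrained_weights, target_weights):
    """Return the pretrained entries that can be copied into the target."""
    transferable = {}
    for name, tensor in pretrained_weights.items():
        if name not in target_weights:
            continue  # layer exists only in network A
        # Compare shapes when both values expose one (tensors do).
        src_shape = getattr(tensor, "shape", None)
        dst_shape = getattr(target_weights[name], "shape", None)
        if src_shape != dst_shape:
            continue  # same name, different size: keep B's own values
        transferable[name] = tensor
    return transferable


def merge_weights(pretrained_weights, target_weights):
    """Copy the target dict and overwrite the entries that can be transferred."""
    merged = dict(target_weights)
    merged.update(select_transferable(pretrained_weights, target_weights))
    return merged
</code></pre></div></div>
<p>With PyTorch this would be used roughly as
<em>model.load_state_dict(merge_weights(torch.load(path), model.state_dict()))</em>.
Layers that are skipped simply keep whatever values network B was initialised with.</p>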
<p><strong>2. Transferring Weights and Distributed Training</strong></p>
<p>I use different shared machines with multiple GPUs and often use different GPU
ids on different days based on availability. I also occasionally switch between
multi-GPU and single-GPU training. I noticed that the
<strong>nn.DataParallel</strong> class can be a bit tricky to navigate under these
conditions, especially if you’re not aware of how models are saved to file, which
I wasn’t at the time.</p>
<p>If you’re training a model on a multi-GPU setup and save the model naively, you
are unknowingly prepending a “module.” prefix to every key in the model’s
<em>state_dict</em>, because <strong>nn.DataParallel</strong> keeps the wrapped network
in an attribute called <em>module</em>. It appeared to me that this also assumes some
implicit binding to specific GPUs (I could be wrong?). Either way, if you naively try to
load and run this model on a different multi-GPU setup, you may see an error
saying that a specific tensor is expected on a specific GPU. We don’t want
that.</p>
<p>What the error message looks like:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RuntimeError: Expected tensor for argument #1 'input' to have the same device as
tensor for argument #2 'weight'; but device 0 does not equal 1 (while checking
arguments for cudnn_convolution)
</code></pre></div></div>
<p>The easiest suggested fix is to iterate through the model <em>state_dict</em> key-value
store and remove the “module.” prefix from each key, like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import OrderedDict

pretrained_model = checkpoint['model']
new_model = SomeNetwork()
new_model_dict = OrderedDict()
for k, v in pretrained_model.state_dict().items():
    # Drop the leading "module." (7 characters) from the name
    name = k[7:]
    new_model_dict[name] = v
new_model.load_state_dict(new_model_dict)
</code></pre></div></div>
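<p>The slice <em>k[7:]</em> assumes every key starts with “module.”. As a slightly more
defensive sketch (the helper name <em>strip_prefix</em> is hypothetical, not a PyTorch
function), the prefix can be removed only where it is actually present, so the
same code works for checkpoints saved with or without <strong>nn.DataParallel</strong>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with the prefix removed from matching keys."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned
</code></pre></div></div>
<p>Usage would look like <em>new_model.load_state_dict(strip_prefix(pretrained_model.state_dict()))</em>.
If the saved tensors also need to land on a different device than the one they were
saved from, <em>torch.load</em> accepts a <em>map_location</em> argument, for example
<em>torch.load(path, map_location='cpu')</em>.</p>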
<p><a href="https://discuss.pytorch.org/t/solved-keyerror-unexpected-key-module-encoder-embedding-weight-in-state-dict/1686/2">Credit</a></p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduPyTorch Scribble PadJetson Xavier - Initial Thoughts2018-10-03T00:00:00-07:002018-10-03T00:00:00-07:00https://shreyasskandan.github.io/posts/jetsonxavier<p>Ever since the Jetson Xavier was announced, I’ve been itching to get my hands on
one of them to put it through its paces. Thanks to James over at <a href="https://www.ghostrobotics.io/">Ghost
Robotics</a> I finally get to play with one of
these. I’ve spent a fair amount of time with the Jetson TX1 and Jetson TX2 and I
will be making direct comparisons to the Xavier’s predecessor, the TX2.</p>
<h1 id="hardware-and-design">Hardware and Design</h1>
<p>Out of the box, the Xavier devkit in no way resembles the previous devkits, and
that’s a good thing because the previous dev kits had limited to no practical
value for what we use them for (mobile robots -
<a href="https://osrf.github.io/ovc/assets/images/ovc1-drone.png">Falcon 250</a> +
<a href="http://open.vision.computer">Open Vision Computer</a>).
The entire physical footprint of the devkit is only slightly larger than the actual module, and it appears that it couldn’t
get much smaller even with a tightly packed carrier board (good job @nvidia). However, my first
reaction was to the weight of this unit. It weighs roughly <strong>660 g</strong> out of the box,
without the power supply. Since this is a loaner unit and I cannot gut the
thing yet, I will guesstimate that most of this weight is the extremely heavy
heat sink and casing. I will update this post once I get my own unit and take
all of that off! The unit is a bit tall too, but about 70% of it is heatsink and fan
enclosure.</p>
<p><strong>Figure 1: TX2 devkit vs Xavier devkit</strong> (food truck cash card for size comparison)
<img src="/images/IMG_3345.jpg" alt="devkit-comparisons-1" />
<strong>Figure 2: Height Comparison</strong>
<img src="/images/IMG_3346.jpg" alt="devkit-comparisons-2" />
<strong>Figure 3: The incredible bulk</strong>
<img src="/images/IMG_3344.jpg" alt="xavier-weight" />
<strong>Figure 4: Dimensions</strong>
<img src="/images/IMG_3352.jpg" alt="xavier-height" />
<strong>Figure 5: Dimensions</strong>
<img src="/images/IMG_3347.jpg" alt="xavier-width" />
<strong>Figure 6: Under the carrier hood</strong>
<img src="/images/IMG_3349.jpg" alt="xavier-carrier" />
<strong>Figure 7: Power supply</strong>
<img src="/images/IMG_3354.jpg" alt="xavier-ps" /></p>
<p>Adding to the good news, this product seems well built and extremely well protected.
If weight isn’t a problem, I would strap one of these onto a robot directly
without the hassle of manufacturing or buying a separate carrier board.</p>
<p>I would have liked at least one more USB Type A port. The
eSATAp+USB3.0 Type A port is cool, but I think most robotics peripherals are still
on Type A and I would have preferred not to bring the battle of dongles into the
robotics world, but oh well. The kind folks at NVIDIA do ship the devkits with
USB-C to Type-A dongles and don’t charge you extra for it (take that @apple).
Apart from the USB-C, the rest of the I/O is similar to the TX2 devkits. There’s an
additional M.2 slot which will definitely prove useful. For those that care, the
power supply adapter is now a bit smaller too. Now, onto the fun stuff...</p>
<h1 id="specifications-and-performance">Specifications and Performance</h1>
<p><strong>CUDA Compatibility Major/Minor version number: 7.2</strong></p>
<p><strong>Multiprocessors: 8</strong> (TX2 has 2)</p>
<p><strong>CUDA Cores/Mp: 64</strong> (TX2 has 128)</p>
<p><strong>Total CUDA Cores: 512</strong> (TX2 has 256)</p>
<p><strong>Global Memory: ~16GB</strong> (TX2 has ~8GB)</p>
<p><strong>GPU Max Frequency: 1500MHz</strong> (TX2 has 1300MHz)</p>
<p><strong>Memory Clock Rate: 1500MHz</strong></p>
<p><strong>Memory Bus Width: 256-bit</strong> (TX2 has 128-bit)</p>
<p><strong>Figure 8: Device Query</strong>
<img src="/images/device_query.png" alt="device-query" /></p>
<p>The CUDA Cores to Multiprocessor ratio is interesting. I will post a more
detailed follow up with actual benchmarks on my code soon. I suspect the Xavier
will be able to better handle multiple CUDA streams and kernel launches because
of this, and that is exciting.</p>
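<p>The totals in the list above follow from multiplying the multiprocessor count by
the cores per multiprocessor. A quick check of the arithmetic, using only the numbers
quoted in this post (the names <em>DEVICES</em> and <em>total_cuda_cores</em> are just for
illustration):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Figures from the list above: (multiprocessors, CUDA cores per multiprocessor)
DEVICES = {
    "Xavier": (8, 64),
    "TX2": (2, 128),
}


def total_cuda_cores(multiprocessors, cores_per_mp):
    """Total CUDA cores is the product of the two figures."""
    return multiprocessors * cores_per_mp


for name, (mps, cores_per_mp) in DEVICES.items():
    print(name, total_cuda_cores(mps, cores_per_mp))
</code></pre></div></div>
<p>This prints 512 for the Xavier and 256 for the TX2: four times as many
multiprocessors, each with half as many cores, for twice the total core count.</p>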
<p>In the CPU realm, the Xavier brings 8 ARMv8 processor cores, which seem to perform significantly
better than those of the TX2, where the Denver cores didn’t really make significant
contributions to performance. The CPU max frequency is 2265MHz and I did a little
stress test to see how hot things could get.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ./jetson_clocks
</code></pre></div></div>
<p>Here is a rough baseline for CPU and GPU temperatures. The device was
idling when these were recorded. These are not freshly booted temperatures;
those are in the late 30s (degrees Celsius).</p>
<p><strong>Figure 8: Before CPU stress test:</strong>
<img src="/images/baseline_perf_thermal.png" alt="thermal-baseline" /></p>
<p>Let’s stress it out:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stress --cpu 8 --io 6 --vm 6 --vm-bytes 2048M --timeout 600s
</code></pre></div></div>
<p><strong>Figure 9: Stress temperatures:</strong>
<img src="/images/stress_temp.png" alt="thermal-cpu" /></p>
<p>CPU-bound processes seem to be handled fairly well. I ran the stress test a few
times for 10 minutes each, and temperatures stayed in the 50s.</p>
<p>To add some fuel to the fire, I threw a pretty intensive GPU-bound process into
the mix (and dialed back the CPU stress io and vm parameters to 2).</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./nbody_opengles -benchmark -fp64 -fullscreen -numbodies=1000000
</code></pre></div></div>
<p><strong>Figure 10: GPU Stress temperatures:</strong>
<img src="/images/cpugpumax.png" alt="hot-hot-hot" /></p>
<p>Things got hot. Both CPU and GPU internal temperatures began to cross the 70
degree mark. Temperatures remained in the 70s and didn’t appear to increase much
even at 100% CPU and 100% GPU usage.</p>
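<p>For anyone who wants to log temperatures like these during a stress run, here is
a minimal sketch. It assumes the board exposes the standard Linux sysfs thermal
interface under <em>/sys/class/thermal</em>, where each zone reports its temperature in
millidegrees Celsius. The function name <em>read_thermal_zones</em> is hypothetical, and
zone names vary by board.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import glob
import os


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_type: temperature in degrees Celsius} for each readable zone."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                zone_type = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                millidegrees = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone is missing a file or reports no valid reading
        readings[zone_type] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for zone_type, celsius in read_thermal_zones().items():
        print(f"{zone_type}: {celsius:.1f} C")
</code></pre></div></div>
<p>Calling this in a loop with a short sleep while the stress commands run gives a
simple temperature trace. If two zones report the same type name, the later one
overwrites the earlier one in the returned dictionary.</p>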
<p>While the devkit seems to be well cooled, I suspect the Xavier will not take
well to having its heatsink, fan and casing thrown away (as we bravely do with
the Falcon 250), but that is an experiment I still intend on performing.</p>
<h1 id="final-thoughts">Final Thoughts</h1>
<p>30-10-02: I think this is a great step forward when compared to the TX2 devkits.
Performance out of the box is impressive. A proper benchmark on existing TX2
code is next on the to-do list along with a more comprehensive thermal analysis
experiment without the fan and heat sink.</p>Shreyas S. Shivakumarsshreyas@seas.upenn.eduEver since the Jetson Xavier was announced, I’ve been itching to get my hands on one of them to put it through its paces. Thanks to James over at Ghost Robotics I finally get to play with one of these. I’ve spent a fair amount of time with the Jetson TX1 and Jetson TX2 and I will be making direct comparisons to the Xavier’s predecessor, the TX2.