nuImages devkit tutorial

Welcome to the nuImages tutorial. This demo assumes the database itself is available at /data/sets/nuimages, and loads a mini version of the dataset.

A Gentle Introduction to nuImages

In this part of the tutorial, let us go through a top-down introduction of our database. Our dataset is structured as a relational database with tables, tokens and foreign keys. The tables are the following:

  1. log - Log from which the sample was extracted.
  2. sample - An annotated camera image with an associated timestamp and past and future images and pointclouds.
  3. sample_data - An image or pointcloud associated with a sample.
  4. ego_pose - The vehicle ego pose and timestamp associated with a sample_data.
  5. sensor - General information about a sensor, e.g. CAM_BACK_LEFT.
  6. calibrated_sensor - Calibration information of a sensor in a log.
  7. category - Taxonomy of object and surface categories (e.g. vehicle.car, flat.driveable_surface).
  8. attribute - Property of an object that can change while the category remains the same.
  9. object_ann - Bounding box and mask annotation of an object (e.g. car, adult).
  10. surface_ann - Mask annotation of a surface (e.g. flat.driveable_surface and vehicle.ego).

The database schema is visualized below. For more information see the schema page.
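This relational pattern (every record carries a unique token, and foreign keys such as log_token point at tokens in other tables) can be sketched with plain dictionaries. A toy illustration with made-up tokens, not the devkit's implementation:

```python
# Toy illustration of the token / foreign-key pattern used by the nuImages tables.
log = [{'token': 'log0', 'location': 'boston-seaport'}]
sample = [{'token': 'sample0', 'log_token': 'log0', 'timestamp': 1535352274870176}]

def get(table, token):
    """Resolve a token to its record (linear scan; the devkit precomputes an index)."""
    return next(rec for rec in table if rec['token'] == token)

# Follow the foreign key from a sample to its log.
rec = get(log, sample[0]['log_token'])
print(rec['location'])  # boston-seaport
```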

Google Colab (optional)



If you are running this notebook in Google Colab, you can uncomment the cell below and run it; everything will be set up for you. Otherwise, set everything up manually.

In [ ]:
# !mkdir -p /data/sets/nuimages  # Make the directory to store the nuImages dataset in.

# !wget https://www.nuscenes.org/data/nuimages-v1.0-mini.tgz  # Download the nuImages mini split.

# !tar -xf nuimages-v1.0-mini.tgz -C /data/sets/nuimages  # Uncompress the nuImages mini split.

# !pip install nuscenes-devkit &> /dev/null  # Install nuImages.

Initialization

To initialize the dataset class, we run the code below. We can change the dataroot parameter if the dataset is installed in a different folder, or omit it to use the default setup.

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
from nuimages import NuImages

nuim = NuImages(dataroot='/data/sets/nuimages', version='v1.0-mini', verbose=True, lazy=True)
======
Loading nuImages tables for version v1.0-mini...
Done loading in 0.000 seconds (lazy=True).
======

Tables

As described above, the NuImages class holds several tables. Each table is a list of records, and each record is a dictionary. For example, the first record of the category table is accessed as:

In [2]:
nuim.category[0]
Loaded 25 category(s) in 0.003s,
Out[2]:
{'token': '63a94dfa99bb47529567cd90d3b58384',
 'name': 'animal',
 'description': 'All animals, e.g. cats, rats, dogs, deer, birds.'}

To see the list of all tables, simply refer to the table_names variable:

In [3]:
nuim.table_names
Out[3]:
['attribute',
 'calibrated_sensor',
 'category',
 'ego_pose',
 'log',
 'object_ann',
 'sample',
 'sample_data',
 'sensor',
 'surface_ann']

Indexing

Since all tables are lists of dictionaries, we can use standard Python operations on them. A very common operation is to retrieve a particular record by its token. Since this operation takes linear time, we precompute an index that helps to access a record in constant time.
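The idea behind that index can be sketched in a few lines. This is a toy illustration with made-up tokens, roughly what the devkit's getind() provides:

```python
# Toy category table; the tokens here are made up for illustration.
category = [
    {'token': 'tok_a', 'name': 'animal'},
    {'token': 'tok_b', 'name': 'human.pedestrian.adult'},
]

# Build the token -> index mapping once in O(n)...
token2ind = {rec['token']: ind for ind, rec in enumerate(category)}

# ...then every lookup is O(1) instead of a linear scan over the table.
ind = token2ind['tok_b']
print(category[ind]['name'])  # human.pedestrian.adult
```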

Let us select the first image in this dataset version and split:

In [4]:
sample_idx = 0
sample = nuim.sample[sample_idx]
sample
Loaded 50 sample(s) in 0.008s,
Out[4]:
{'token': '09acd654cb514bdeab8e3afedad74fca',
 'timestamp': 1535352274870176,
 'log_token': '4ed5d1230fcb48d39db895f754e724f9',
 'key_camera_token': 'e2f18fef9da44bf1ad9682e18b1b9f22',
 'key_lidar_token': '994740c50ebb455181126cd35020f1fe'}

We can also get the sample record from a sample token:

In [5]:
sample = nuim.get('sample', sample['token'])
sample
Out[5]:
{'token': '09acd654cb514bdeab8e3afedad74fca',
 'timestamp': 1535352274870176,
 'log_token': '4ed5d1230fcb48d39db895f754e724f9',
 'key_camera_token': 'e2f18fef9da44bf1ad9682e18b1b9f22',
 'key_lidar_token': '994740c50ebb455181126cd35020f1fe'}

Under the hood, this looks up the precomputed index. We can verify that it returns the same index that we started from:

In [6]:
sample_idx_check = nuim.getind('sample', sample['token'])
assert sample_idx == sample_idx_check

From the sample, we can directly access the corresponding keyframe sample data. This will be useful further below.

In [7]:
key_camera_token = sample['key_camera_token']
print(key_camera_token)
e2f18fef9da44bf1ad9682e18b1b9f22

Lazy loading

Initializing the NuImages instance above was very fast, as we did not actually load the tables. Instead, the class implements lazy loading: it overrides the internal __getattr__() function to load a table if it is not already stored in memory. The moment we accessed category, we could see the table being loaded from disk. To disable such notifications, just set verbose=False when initializing the NuImages object. Furthermore, lazy loading can be disabled with lazy=False.
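The pattern behind this can be sketched as follows (a simplified stand-in, not the devkit's actual code): __getattr__() only fires for attributes that are not yet set, so each table is read from disk at most once and then cached on the instance.

```python
class LazyTables:
    """Minimal sketch of lazy table loading via __getattr__."""

    TABLE_NAMES = ['category', 'sample']

    def __getattr__(self, name):
        # Only called when `name` is not found the normal way.
        if name in self.TABLE_NAMES:
            table = self._load_table(name)
            setattr(self, name, table)  # cache it; __getattr__ won't fire again
            return table
        raise AttributeError(name)

    def _load_table(self, name):
        print('Loading table %s...' % name)
        return []  # stand-in for reading the JSON table from disk

nuim = LazyTables()
nuim.category  # prints 'Loading table category...'
nuim.category  # cached: prints nothing
```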

Rendering

To render an image we use the render_image() function. We can see the boxes and masks for each object category, as well as the surface masks for ego vehicle and driveable surface. We use the following colors:

  • vehicles: orange
  • bicycles and motorcycles: red
  • pedestrians: blue
  • cones and barriers: gray
  • driveable surface: teal / green

At the top left corner of each box, we see the name of the object category (if with_category=True). We can also set with_attributes=True to print the attributes of each object (this requires with_category=True). In addition, we can render both surfaces and objects, only surfaces, only objects, or neither, by setting annotation_type to all, surfaces, objects or none, respectively.

Let us make the image bigger for better visibility by setting render_scale=2. We can also change the line width of the boxes using box_line_width. By setting it to -1, the line width adapts to the render_scale. Finally, we can render the image to disk using out_path.

In [8]:
nuim.render_image(key_camera_token, annotation_type='all',
                  with_category=True, with_attributes=True, box_line_width=-1, render_scale=2)
Loaded 1300 sample_data(s) in 0.013s,
Loaded 58 surface_ann(s) in 0.002s,
Loaded 506 object_ann(s) in 0.003s,
Loaded 12 attribute(s) in 0.000s,

Let us find out which annotations are in that image.

In [9]:
object_tokens, surface_tokens = nuim.list_anns(sample['token'])
Printing object annotations:
06eed0ca8b164b84bbb2851de1ed2c13 vehicle.car ['vehicle.moving']
0e8ba57c7b69482c88319f5c1b4deeb0 movable_object.trafficcone []
11ec9a46540443339e2e38afbe31f7b1 human.pedestrian.adult ['pedestrian.standing']
4b27e4a70d464cb2a2f33d5dbcf85094 human.pedestrian.adult ['pedestrian.moving']
4c76bc9ee7da40668f1d4b294209ae3b human.pedestrian.adult ['pedestrian.standing']
4e61ccd6905644adb0556e1f336cee79 movable_object.barrier []
584cb4bd0e7c4a0b8b1169191ca828a1 vehicle.car ['vehicle.moving']
677a87b7df1a4ee7a7a36bab569cccbd human.pedestrian.adult ['pedestrian.moving']
683e330396134c6393fd77187194990c human.pedestrian.adult ['pedestrian.moving']
82e0c68c0f2440bcb041a51a6f116513 human.pedestrian.adult ['pedestrian.moving']
8dc2b24b1a69434a8aade0cb4e308e8e vehicle.car ['vehicle.moving']
924572ff00404ae59d1ee2f6f6c92274 human.pedestrian.adult ['pedestrian.moving']
9b8ea679730b43d7b6631ceeb56e0ccf human.pedestrian.adult ['pedestrian.moving']
a457fc08800444bc83900e3a12b00619 movable_object.barrier []
c4308276d2ab463b9aca936c4c5d1dfb vehicle.bus.rigid ['vehicle.moving']
d287a7310b0a44cda3aa75215cdb676a human.pedestrian.adult ['pedestrian.moving']

Printing surface annotations:
f573eaa17a595521a39c4116f05a6f58 flat.driveable_surface

We can see the object_ann and surface_ann tokens. Let's again render the image, but only focus on the first object and the first surface annotation. We can use the object_tokens and surface_tokens arguments as shown below. We see that only one car and the driveable surface are rendered.

In [10]:
nuim.render_image(key_camera_token, with_category=True, object_tokens=[object_tokens[0]], surface_tokens=[surface_tokens[0]])

To get the raw data (i.e. the segmentation masks, both semantic and instance) of the above, we can use get_segmentation().

In [11]:
import matplotlib.pyplot as plt

semantic_mask, instance_mask = nuim.get_segmentation(key_camera_token)

plt.figure(figsize=(32, 9))

plt.subplot(1, 2, 1)
plt.imshow(semantic_mask)
plt.subplot(1, 2, 2)
plt.imshow(instance_mask)

plt.show()
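The masks returned above are plain integer arrays, so basic statistics fall out of numpy directly. A sketch on a toy mask, assuming (as is conventional) that 0 marks background and each positive integer one object instance; check the devkit documentation for the exact encoding:

```python
import numpy as np

# Toy stand-in for the instance mask returned by get_segmentation():
# 0 = background, each positive integer = one object instance.
instance_mask = np.array([[0, 0, 1, 1],
                          [0, 2, 2, 1],
                          [3, 3, 0, 0]])

instance_ids = np.unique(instance_mask)
instance_ids = instance_ids[instance_ids != 0]    # drop the background
print('number of instances:', len(instance_ids))  # 3

# Pixel area of each instance:
areas = {int(i): int((instance_mask == i).sum()) for i in instance_ids}
print(areas)  # {1: 3, 2: 2, 3: 2}
```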

Every annotated image (keyframe) comes with up to 6 past and 6 future images, spaced evenly at 500ms +- 250ms. However, a small percentage of the samples have fewer sample_datas, either because they were at the beginning or end of a log, or due to delays or dropped data packets. list_sample_content() shows all the sample_datas associated with each sample.

In [12]:
nuim.list_sample_content(sample['token'])
Listing sample content...
Rel. time	Sample_data token
     -3.0	cd20ce48c6ad4342ba04df21e405bfbc
     -2.5	adcc9c33407240e292b690c984073ebb
     -2.0	6dcfa14607384d5db7007e050af47ccd
     -1.5	1ad4cf343bdf4b5382f687cd03134377
     -1.0	062c644c88f542159176a892b4cb1544
     -0.5	8ae1c97b38894e668aa8da379a7c1aea
      0.0	e2f18fef9da44bf1ad9682e18b1b9f22
      0.5	b0d6c7204ece4c938140fb2681f6bfab
      1.0	38dcdd86fce74a6387806b66b607653e
      1.5	f590922e422c45eaa3b3228a90efa757
      2.1	2621cdd5420f4cc7b3878bedf84ad8c5
      2.5	b353f0a729df4ab69e98f61b9abb6427
      3.0	dd8c46d2592b4ed98d17eefc51cc209f

Besides the annotated images, we can also render the 6 previous and 6 future images, which are not annotated. Let's select the next image, which is taken around 0.5s after the annotated image. We can either manually copy the token from the list above or use the next pointer of the sample_data.

In [13]:
next_camera_token = nuim.get('sample_data', key_camera_token)['next']
next_camera_token
Out[13]:
'b0d6c7204ece4c938140fb2681f6bfab'

Now that we have the next token, let's render it. Note that we cannot render the annotations, as they don't exist.

Note: If you did not download the non-keyframes (sweeps), this will throw an error! We make sure to catch it here.

In [14]:
try:
    nuim.render_image(next_camera_token, annotation_type='none')
except Exception as e:
    print('As expected, we encountered this error:', e)
As expected, we encountered this error: Error: Cannot render annotations for non keyframes!
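The prev and next tokens turn the sample_datas of a clip into a doubly linked list, so gathering the whole sequence is a pointer walk. A toy sketch on a hand-built table; with the real devkit, the dictionary lookup would be nuim.get('sample_data', token):

```python
# Toy sample_data table keyed by token; '' marks the ends of the chain.
sample_data = {
    'a': {'token': 'a', 'prev': '',  'next': 'b'},
    'b': {'token': 'b', 'prev': 'a', 'next': 'c'},
    'c': {'token': 'c', 'prev': 'b', 'next': ''},
}

def collect_chain(table, key_token):
    """Walk prev pointers to the start, then next pointers to the end."""
    token = key_token
    while table[token]['prev']:
        token = table[token]['prev']
    chain = []
    while token:
        chain.append(token)
        token = table[token]['next']
    return chain

print(collect_chain(sample_data, 'b'))  # ['a', 'b', 'c']
```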

In this section we have presented a number of rendering functions. For convenience we also provide a script render_images.py that runs one or all of these rendering functions on a random subset of the 93k samples in nuImages. To run it, simply execute the following line in your command line. This will save image, depth, pointcloud and trajectory renderings of the front camera to the specified folder.

>> python nuimages/scripts/render_images.py --mode all --cam_name CAM_FRONT --out_dir ~/Downloads/nuImages --out_type image

Instead of rendering the annotated keyframe, we can also render a video of the 13 individual images, spaced at 2 Hz.

>> python nuimages/scripts/render_images.py --mode all --cam_name CAM_FRONT --out_dir ~/Downloads/nuImages --out_type video

Poses and CAN bus data

The ego_pose provides the translation, rotation, rotation_rate, acceleration and speed measurements closest to each sample_data. We can visualize the trajectories of the ego vehicle throughout the 6s clip of each annotated keyframe. Here the red x indicates the start of the trajectory and the green o the position at the annotated keyframe. We can set rotation_yaw to have the driving direction at the time of the annotated keyframe point "upwards" in the plot, or set it to None to use the default orientation (north pointing up). To get the raw data of this plot, use get_ego_pose_data() or get_trajectory().

In [15]:
nuim.render_trajectory(sample['token'], rotation_yaw=0, center_key_pose=True)
Loaded 1300 ego_pose(s) in 0.015s,
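The effect of rotation_yaw can be illustrated with a small 2d rotation: center the trajectory on the key pose, then rotate so the key-pose heading lands on the +y axis. A simplified sketch with made-up coordinates; the devkit's exact conventions may differ:

```python
import numpy as np

def center_and_rotate(xy, key_index, key_yaw):
    """Center a 2d trajectory on the key pose and rotate so its heading points up (+y)."""
    xy = xy - xy[key_index]
    angle = np.pi / 2 - key_yaw  # map the heading `key_yaw` onto the +y axis
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]])
    return xy @ rot.T

# Toy trajectory driving along +x; yaw at the key pose is 0 (facing +x).
xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
out = center_and_rotate(xy, key_index=2, key_yaw=0.0)
print(out.round(6))  # trajectory now runs along +y, ending at the origin
```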

Statistics

The list_*() methods are useful to get an overview of the dataset dimensions. Note that these statistics are always for the current split that we initialized the NuImages instance with, rather than the entire dataset.

In [16]:
nuim.list_logs()
Loaded 44 log(s) in 0.008s,

Samples Log                           Location                
     1 n003-2018-01-03-12-03-23+0800 singapore-onenorth      
     1 n003-2018-01-04-11-23-25+0800 singapore-onenorth      
     1 n003-2018-01-08-11-30-34+0800 singapore-onenorth      
     1 n003-2018-07-12-15-40-35+0800 singapore-onenorth      
     1 n004-2018-01-04-11-05-42+0800 singapore-onenorth      
     2 n005-2018-06-14-20-11-03+0800 singapore-onenorth      
     1 n006-2018-09-17-12-15-45-0400 boston-seaport          
     1 n008-2018-03-14-15-16-29-0400 boston-seaport          
     3 n008-2018-05-21-11-06-59-0400 boston-seaport          
     1 n008-2018-05-30-15-20-59-0400 boston-seaport          
     2 n008-2018-05-30-16-31-36-0400 boston-seaport          
     1 n008-2018-06-04-16-30-00-0400 boston-seaport          
     1 n008-2018-09-18-14-18-33-0400 boston-seaport          
     1 n009-2018-05-08-15-52-41-0400 boston-seaport          
     1 n009-2018-09-12-09-59-51-0400 boston-seaport          
     1 n010-2018-07-05-14-36-33+0800 singapore-onenorth      
     1 n010-2018-07-06-11-01-46+0800 singapore-onenorth      
     1 n010-2018-08-27-12-00-23+0800 singapore-onenorth      
     1 n010-2018-08-27-15-10-53+0800 singapore-onenorth      
     1 n010-2018-09-17-15-57-10+0800 singapore-onenorth      
     1 n013-2018-08-01-16-46-39+0800 singapore-onenorth      
     1 n013-2018-08-02-13-54-05+0800 singapore-onenorth      
     1 n013-2018-08-02-14-08-14+0800 singapore-onenorth      
     1 n013-2018-08-03-14-44-49+0800 singapore-onenorth      
     2 n013-2018-08-16-16-15-38+0800 singapore-onenorth      
     2 n013-2018-08-20-14-38-24+0800 singapore-onenorth      
     1 n013-2018-08-21-11-46-25+0800 singapore-onenorth      
     1 n013-2018-08-27-11-35-10+0800 singapore-onenorth      
     1 n013-2018-08-27-14-41-26+0800 singapore-onenorth      
     1 n013-2018-08-27-15-47-08+0800 singapore-onenorth      
     1 n013-2018-08-27-16-40-42+0800 singapore-onenorth      
     1 n013-2018-08-28-16-04-27+0800 singapore-onenorth      
     1 n013-2018-08-29-11-41-15+0800 singapore-onenorth      
     1 n013-2018-08-29-14-19-16+0800 singapore-onenorth      
     1 n013-2018-09-03-14-54-42+0800 singapore-onenorth      
     1 n013-2018-09-04-13-30-50+0800 singapore-onenorth      
     1 n014-2018-06-25-21-03-46-0400 boston-seaport          
     1 n015-2018-09-05-12-12-46+0800 singapore-onenorth      
     1 n015-2018-09-13-15-25-57+0800 singapore-onenorth      
     1 n015-2018-09-19-11-19-35+0800 singapore-onenorth      
     1 n016-2018-07-04-10-44-39+0800 singapore-onenorth      
     1 n016-2018-07-06-12-06-18+0800 singapore-queenstown    
     1 n016-2018-07-10-11-22-35+0800 singapore-onenorth      
     1 n016-2018-07-10-16-55-57+0800 singapore-onenorth      

list_categories() lists the category frequencies, as well as the category name and description. Each category is either an object or a surface, but not both.

In [17]:
nuim.list_categories(sort_by='object_freq')
Object_anns Surface_anns Name                     Description                                     
        189            0 human.pedestrian.adult   Adult subcategory.                              
        122            0 vehicle.car              Vehicle designed primarily for personal use, e.g
         70            0 movable_object.barrier   Temporary road barrier placed in the scene in or
         44            0 movable_object.trafficco All types of traffic cone.                      
         28            0 vehicle.truck            Vehicles primarily designed to haul cargo includ
         14            0 vehicle.bicycle          Human or electric powered 2-wheeled vehicle desi
         14            0 vehicle.motorcycle       Gasoline or electric powered 2-wheeled vehicle d
          6            0 human.pedestrian.constru Construction worker                             
          5            0 vehicle.bus.rigid        Rigid bus subcategory.                          
          5            0 vehicle.construction     Vehicles primarily designed for construction. Ty
          3            0 human.pedestrian.persona A small electric or self-propelled vehicle, e.g.
          2            0 movable_object.pushable_ Objects that a pedestrian may push or pull. For 
          2            0 vehicle.trailer          Any vehicle trailer, both for trucks, cars and b
          1            0 movable_object.debris    Movable object that is left on the driveable sur
          1            0 static_object.bicycle_ra Area or device intended to park or secure the bi
          0           49 flat.driveable_surface   Surfaces should be regarded with no concern of t
          0            9 vehicle.ego              Ego vehicle.                                    

We can also specify a sample_tokens parameter for list_categories() to get the category statistics for a particular set of samples.

In [18]:
sample_tokens = [nuim.sample[9]['token']]
nuim.list_categories(sample_tokens=sample_tokens)
Object_anns Surface_anns Name                     Description                                     
          3            0 movable_object.barrier   Temporary road barrier placed in the scene in or
          1            0 human.pedestrian.constru Construction worker                             
          1            0 vehicle.car              Vehicle designed primarily for personal use, e.g
          1            0 vehicle.construction     Vehicles primarily designed for construction. Ty
          1            0 vehicle.truck            Vehicles primarily designed to haul cargo includ
          0            1 flat.driveable_surface   Surfaces should be regarded with no concern of t

list_attributes() shows the frequency, name and description of all attributes:

In [19]:
nuim.list_attributes(sort_by='freq')
Annotations Name                     Description                                     
        100 pedestrian.moving        The human is moving.                            
         81 vehicle.parked           Vehicle is stationary (usually for longer durati
         66 vehicle.moving           Vehicle is moving.                              
         54 pedestrian.standing      The human is standing.                          
         41 pedestrian.sitting_lying The human is sitting or lying down.             
         24 cycle.without_rider      There is NO rider on the bicycle or motorcycle. 
         15 vehicle.stopped          Vehicle, with a driver/rider in/on it, is curren
          7 cycle.with_rider         There is a rider on the bicycle or motorcycle.  
          0 vehicle_light.emergency. Vehicle is flashing emergency lights.           
          0 vehicle_light.emergency. Vehicle is not flashing emergency lights.       
          0 vertical_position.off_gr Object is not on the ground plane, e.g. flying, 
          0 vertical_position.on_gro Object is on the ground plane.                  

list_cameras() shows us how many camera entries and samples there are for each channel, such as the front camera. Each camera uses slightly different intrinsic parameters, which will be provided in a future release.

In [20]:
nuim.list_cameras()
Loaded 94 calibrated_sensor(s) in 0.008s,
Loaded 7 sensor(s) in 0.001s,

Calibr. sensors Samples Channel                  
              5       5 CAM_FRONT_LEFT           
              9       9 CAM_BACK_RIGHT           
              7       7 CAM_FRONT_RIGHT          
             44      50 LIDAR_TOP                
             10      10 CAM_BACK_LEFT            
              8       8 CAM_FRONT                
             11      11 CAM_BACK                 

list_sample_data_histogram() shows a histogram of the number of images per annotated keyframe. Note that there are at most 13 images per keyframe. For the mini split shown here, all keyframes have 13 images.

In [21]:
nuim.list_sample_data_histogram()
Listing sample_data frequencies..
# images	# samples
      13	50
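Such a histogram is easy to recompute by hand with collections.Counter. A toy sketch on made-up image counts, not the devkit's implementation:

```python
from collections import Counter

# Toy stand-in: number of images found for each annotated keyframe.
images_per_sample = [13, 13, 12, 13]

histogram = Counter(images_per_sample)
for n_images, n_samples in sorted(histogram.items()):
    print('%8d\t%d' % (n_images, n_samples))
```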