Dataset Construction

  1. homography estimation
    1. Vanishing Point based auto-calibration
      1. if the vanishing point (VP) estimate is skewed because of low CCTV image quality, the coordinate error of far objects diverges.
    2. Robust Automatic Monocular Vehicle Speed Estimation for Traffic Surveillance (ICCV 2021)
      1. detection + tracking (Mask R-CNN, ByteTrack)
      2. we have to cherry-pick the best time span of the CCTV footage to get a correct homography estimate
      3. mask the invalid area (in vp)
      4. compute the homography (using the Jacobian that the model outputs)
  2. construct dataset
    1. compute bev map

      [figure: BEV map (image.png)]

    2. with the Jacobian (J), compute the orientation (yaw)

      $$ \theta = \operatorname{arctan2}(dy, dx) $$

    3. ✅ we assume Z = 0 (the road is a plane). If we draw a line in the front view (FV), then in the BEV the line along the road boundary explodes toward infinite coordinates. This is a violation of the planar assumption.

    4. run semantic segmentation on the FV image → transform to the BEV only the pixels with class == road
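
The yaw and planar-assumption steps above can be sketched as follows. This is a minimal illustration, not the method of the cited paper: it assumes a plain 3x3 homography `H` from image pixels to the Z = 0 ground plane and derives the 2x2 Jacobian analytically, whereas the notes say the Jacobian comes from the model. The function names and `H_example` are hypothetical.

```python
import math

def project(H, x, y):
    """Map image point (x, y) to ground-plane (X, Y) with a 3x3 homography H.

    Also returns the homogeneous scale w; w -> 0 means the point is
    approaching the horizon line of the ground plane.
    """
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    X = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    Y = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return X, Y, w

def jacobian(H, x, y):
    """2x2 Jacobian d(X, Y) / d(x, y) of the homography at image point (x, y)."""
    X, Y, w = project(H, x, y)
    return [
        [(H[0][0] - X * H[2][0]) / w, (H[0][1] - X * H[2][1]) / w],
        [(H[1][0] - Y * H[2][0]) / w, (H[1][1] - Y * H[2][1]) / w],
    ]

def yaw_from_image_direction(H, x, y, du, dv):
    """Push an image-space heading (du, dv) through J, then theta = arctan2(dy, dx)."""
    J = jacobian(H, x, y)
    dx = J[0][0] * du + J[0][1] * dv
    dy = J[1][0] * du + J[1][1] * dv
    return math.atan2(dy, dx)

# Hypothetical homography (assumption): its horizon (w = 0) is the image row y = 100.
H_example = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.01, -1.0],
]

if __name__ == "__main__":
    # The same image point gets pushed further and further away as it nears the horizon.
    for row in (200.0, 110.0, 101.0, 100.1):
        X, Y, w = project(H_example, 50.0, row)
        print(f"image y={row:6.1f} -> ground X={X:10.1f}, Y={Y:10.1f}")
    print("yaw:", yaw_from_image_direction(H_example, 50.0, 200.0, 1.0, 0.0))
```

Under these assumptions, the ground coordinates grow without bound as `w` approaches zero, which is one way to see why points near the road boundary or horizon blow up in the BEV and why restricting the transform to road pixels helps.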

how was DAIR-V2X constructed?


data construction guidance

convert_cctv.py

  1. sliding window

    1. window_size: 100 frames (50 historical + 50 future)
    2. stride: 50 frames (50% overlap between scenarios)
    3. creates multiple scenarios from one continuous recording
  2. target_agent selection

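    1. just select the vehicle that appears in the most frames
    target_id = frame_counts.idxmax()  # vehicle with the most frames
    target_frame_count = frame_counts[target_id]
    if target_frame_count < min_target_frames:  # min_target_frames = 80
        continue  # skip this scenario if there is insufficient data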
    
  3. format conversion

    1. just convert format
    # input (raw)
    timestamp,id,type,sub_type,x,y,theta,vx,vy
    0.00,553,vehicle,car,-48.06,179.87,-1.88,0.04,0.25
    
    # output (v2x-seq-tfd format)
    
    city,timestamp,id,type,sub_type,tag,x,y,z,length,width,height,theta,v_x,v_y,intersect_id
    cheonan,75.0,555,VEHICLE,CAR,TARGET_AGENT,-72.35,185.70,0.0,4.5,1.8,1.5,-1.33,-0.01,0.0,CCTV#CCTV051
    
    
  4. changes made

    1. add tag column: “TARGET_AGENT” or “OTHERS”
    2. Add dimensions: z=0, length=4.5m, width=1.8m, height=1.5m
  5. train/val split

    1. shuffles all scenarios
    2. split 80% train / 20% validation
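
The steps above can be sketched end to end as follows. This is a standard-library approximation, not the actual contents of `convert_cctv.py`: the function names, the dict-per-row representation, the assumption of one row per agent per frame, and the fixed shuffle seed are all hypothetical. The numeric values (100, 50, 80, 4.5 / 1.8 / 1.5, 80/20) come from the notes.

```python
import random
from collections import Counter

WINDOW_SIZE = 100        # 50 historical + 50 future frames
STRIDE = 50              # 50% overlap between consecutive scenarios
MIN_TARGET_FRAMES = 80   # skip a scenario if the best candidate is seen less often

def sliding_windows(num_frames, window_size=WINDOW_SIZE, stride=STRIDE):
    """Return (start, end) frame-index ranges, end exclusive, for one recording."""
    return [(s, s + window_size)
            for s in range(0, num_frames - window_size + 1, stride)]

def select_target(rows, min_frames=MIN_TARGET_FRAMES):
    """Return the id seen in the most rows, or None if it is seen too rarely.

    Assumes each agent contributes at most one row per frame, so the row
    count equals the frame count.
    """
    counts = Counter(r["id"] for r in rows)
    if not counts:
        return None
    target_id, n = counts.most_common(1)[0]
    return target_id if n >= min_frames else None

def tag_rows(rows, target_id):
    """Add the tag column and the fixed default dimensions to every row."""
    return [
        {**r,
         "tag": "TARGET_AGENT" if r["id"] == target_id else "OTHERS",
         "z": 0.0, "length": 4.5, "width": 1.8, "height": 1.5}
        for r in rows
    ]

def train_val_split(scenarios, train_ratio=0.8, seed=0):
    """Shuffle the scenarios and split them 80% train / 20% validation."""
    shuffled = list(scenarios)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

if __name__ == "__main__":
    print(sliding_windows(250))
    train, val = train_val_split(range(10))
    print(len(train), len(val))
```

With a stride of half the window, every frame away from the ends of a recording falls into two scenarios, so shuffling at scenario level can place overlapping frames in both train and validation; splitting by recording instead would avoid that, if it matters for the evaluation.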

preprocess.py

  1. ensure 100 timestamps
  2. identify target_agent
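
One possible reading of these two checks is sketched below. The notes do not say whether `preprocess.py` validates, pads, or truncates scenarios, so this version only validates; `check_scenario` and the dict-per-row representation are hypothetical, while the `timestamp`, `id`, and `tag` field names follow the output format shown earlier.

```python
EXPECTED_TIMESTAMPS = 100  # window_size used when the scenarios were created

def check_scenario(rows, expected=EXPECTED_TIMESTAMPS):
    """Return the target agent's id if the scenario is usable, else None.

    A scenario is treated as usable when it contains exactly `expected`
    distinct timestamps and exactly one agent tagged TARGET_AGENT.
    """
    timestamps = {r["timestamp"] for r in rows}
    if len(timestamps) != expected:
        return None
    target_ids = {r["id"] for r in rows if r["tag"] == "TARGET_AGENT"}
    if len(target_ids) != 1:
        return None
    return next(iter(target_ids))
```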