Published Paper at ICECCME 2022
"The 2nd International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME) is the premier event that brings together industry professionals, academics, and engineers from the related institutions to exchange information and ideas on electrical, computer, communications and mechatronic engineering. The conference will feature a comprehensive technical program offering numerous technical sessions with papers showcasing the latest technologies, and applications."
I will be presenting my paper titled "Depth Maps Comparisons from Monocular Images by
MiDaS Convolutional Neural Networks and Dense Prediction Transformers" at the International Conference on
Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), held in the Maldives in
November 2022. A link to my recorded presentation and published paper will be available soon.
This paper was based on the first half of my MSDS Capstone Project - see MSDS tab for more
information. The research was a comparative case study on inferring depth maps from monocular images using two Deep
Learning frameworks: (1) Convolutional Neural Networks (CNN) and (2) Dense Prediction Transformers (DPT). Overall, my analysis
found that the Hybrid-DPT model exhibited computational efficiency similar to the CNN (a known strength of CNNs)
while maintaining accurate representations of 3D geometries on par with the Large-DPT model. Geometric accuracy
was assessed with a statistical outlier-vs-inlier measure: a point was flagged as an outlier when its average distance
to its 50 nearest neighbors exceeded a threshold of 2.0 standard deviations.
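The outlier criterion above can be sketched in a few lines of NumPy. This is a minimal illustration of the statistical measure, not the paper's actual implementation (which may use a library routine such as a point-cloud toolkit); the toy point cloud and parameter names are mine.

```python
import numpy as np

def statistical_outlier_mask(points, nb_neighbors=50, std_ratio=2.0):
    """Flag outliers in an (N, 3) point cloud.

    A point is an outlier when its mean distance to its `nb_neighbors`
    nearest neighbors exceeds the global mean of those distances by more
    than `std_ratio` standard deviations. Returns a boolean inlier mask.
    """
    # Pairwise Euclidean distances (fine for small clouds; a KD-tree
    # would be the usual choice for large ones).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude each point's self-distance

    # Mean distance from each point to its k nearest neighbors.
    k = min(nb_neighbors, len(points) - 1)
    mean_knn = np.sort(dists, axis=1)[:, :k].mean(axis=1)

    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    return mean_knn <= threshold

# Toy example: a dense unit cube of points plus one far-away stray point.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.random((50, 3)), [[10.0, 10.0, 10.0]]])
mask = statistical_outlier_mask(cloud, nb_neighbors=50, std_ratio=2.0)
print(mask.sum(), "inliers of", len(cloud))  # the stray point is rejected
```

With these settings the 50 cube points stay inliers while the stray point's large mean neighbor distance pushes it past the 2.0-standard-deviation threshold.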
Important to note: every image is a 2-dimensional representation of our real 3-dimensional world. Every camera
can be approximated by a pinhole camera model that projects the real world onto the image you see. This paper briefly describes reverse-engineering that process:
information from the 3x3 intrinsic camera matrix is used to form a 4x4 Q-matrix, which maps pixels on the image (X, Y)
into a virtual 3-dimensional space. In summary, real-world 3D points are captured in a 2D image and reprojected back into a virtual 3D space.
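As a rough sketch of that reprojection step, the snippet below builds a 4x4 Q-matrix from hypothetical intrinsics and maps a pixel plus a disparity-like inverse-depth value to a 3D point. The intrinsic values are placeholders, not the paper's, and sign conventions for the Q-matrix vary between implementations (e.g. OpenCV's negates some entries).

```python
import numpy as np

# Hypothetical intrinsics: fx = focal length in pixels, (cx, cy) = principal
# point, B = baseline in meters. Placeholder values for illustration only.
fx, cx, cy, B = 500.0, 320.0, 240.0, 0.1

# 4x4 reprojection matrix assembled from the 3x3 intrinsic parameters,
# in the same spirit as a stereo Q matrix.
Q = np.array([
    [1.0, 0.0, 0.0,     -cx],
    [0.0, 1.0, 0.0,     -cy],
    [0.0, 0.0, 0.0,      fx],
    [0.0, 0.0, 1.0 / B,  0.0],
])

def reproject(x, y, disparity):
    """Map an image pixel (x, y) with disparity d to a 3D point (X, Y, Z)."""
    X, Y, Z, W = Q @ np.array([x, y, disparity, 1.0])
    return np.array([X, Y, Z]) / W  # divide out the homogeneous coordinate

point = reproject(320, 240, 50.0)
print(point)  # the principal-point pixel lands on the optical axis: [0, 0, 1]
```

A pixel at the principal point maps onto the optical axis, and depth falls out as Z = fx * B / d, which is why larger disparities (closer objects) yield smaller depths.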
| ICECCME presentation: | COMING SOON! |
| ICECCME published paper: | COMING SOON! |