Dr. Y. Bandung, S.T., M.T.
Sekolah Teknik Elektro dan Informatika
Dr. Kusprasapta Mutijarsa, S.T., M.T.
Sekolah Teknik Elektro dan Informatika
Prof. Ir. Armein Z. R. Langi, M.Sc., Ph.D.
Sekolah Teknik Elektro dan Informatika
Dr. Fetty Fitriyanti Lubis, S.T., M.T.
Sekolah Teknik Elektro dan Informatika
Yusep Rosmansyah, S.T., M.Sc., Ph.D.
Sekolah Teknik Elektro dan Informatika
Abstract
The Multimedia Internet of Things (M-IoT) focuses on processing multimedia data, including audio, video, and images, for applications like agriculture, surveillance, smart homes, health monitoring, and traffic management. M-IoT systems use a three-layer architecture: sensing, communication, and application. However, they face challenges such as limited computing resources in sensor devices, network constraints, and scalability issues. The integration of video into M-IoT systems, known as the Internet of Video Things (IoVT), brings additional challenges like data compression, processing, transmission, and security. This study proposes a machine learning-based frame resolution adjustment system to optimize video delivery in IoT environments. The system includes a throughput predictor and a file size estimator to dynamically adjust video resolution, ensuring consistent frame rates despite network fluctuations. Using the Constrained Application Protocol (CoAP) for efficient data transmission, the system was implemented on resource-constrained devices like Raspberry Pi and ESP-EYE. Experiments revealed that the SES method provided the most accurate throughput predictions with minimal execution time. The system successfully maintained video quality by predicting throughput and file size and adjusting video resolution accordingly. This approach improves video delivery performance in IoT environments while addressing challenges like unstable network conditions and resource limitations in sensor devices.
Introduction
Multimedia Internet of Things (M-IoT) is a growing IoT trend focused on processing multimedia data, such as audio, video, and images, with applications in agriculture, surveillance, smart homes, health monitoring, industry, and traffic management. Built on a three-layer architecture (sensing, communication, and application), M-IoT faces challenges in data computing and communication due to the limited resources of sensor devices and network constraints. The Internet of Video Things (IoVT) has emerged as a specialized domain within M-IoT that involves video delivery and faces issues such as packet loss, jitter, and frame resolution adjustment, especially under unstable network conditions.
IoVT integrates Artificial Intelligence (AI) and machine learning to overcome challenges such as predicting throughput and dynamically adjusting video resolution to maintain a stable frame rate. This study investigates a machine learning-based frame resolution adjustment system to improve video delivery in IoT environments. The system comprises two subsystems: a throughput predictor implemented on a local server (e.g., a single-board computer such as Raspberry Pi) and a file size estimator implemented on multimedia sensor devices (e.g., ESP-EYE). The throughput predictor forecasts future network conditions, while the file size estimator estimates video frame sizes. Together, these components adjust frame resolution to ensure consistent frame rates during transmission.
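The interplay between the two subsystems can be illustrated with a minimal Python sketch. It is not the study's implementation: the resolution ladder, the `bytes_per_pixel` constant, and the function names are hypothetical, and the file size estimator is replaced by a crude stand-in that assumes frame size scales with pixel count. Only the 5 fps target comes from the reported results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    name: str
    width: int
    height: int


# Hypothetical resolution ladder, ordered from highest to lowest.
LADDER = [
    Resolution("VGA", 640, 480),
    Resolution("CIF", 400, 296),
    Resolution("QVGA", 320, 240),
    Resolution("QQVGA", 160, 120),
]


def estimate_frame_bytes(res, bytes_per_pixel=0.15):
    """Stand-in file size estimator (assumption): size grows with pixel count."""
    return round(res.width * res.height * bytes_per_pixel)


def choose_resolution(predicted_throughput_bps, target_fps=5, ladder=LADDER):
    """Pick the highest resolution whose estimated frame fits the per-frame budget."""
    # Bytes available for each frame if target_fps frames must be sent per second.
    budget_bytes = predicted_throughput_bps / 8 / target_fps
    for res in ladder:
        if estimate_frame_bytes(res) <= budget_bytes:
            return res
    # Nothing fits: fall back to the lowest resolution rather than stop streaming.
    return ladder[-1]


if __name__ == "__main__":
    for bps in (2_000_000, 1_000_000, 500_000, 50_000):
        print(bps, "->", choose_resolution(bps).name)
```

In this sketch the predicted throughput sets a per-frame byte budget, and the estimator decides which resolution fits inside it; a real estimator would also account for scene content and compression quality.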
The study conducted statistical comparisons of machine learning algorithms to identify the optimal predictor for throughput and developed a CoAP-based IoT protocol for efficient video delivery. Experimental results validated the system’s ability to predict throughput and frame sizes accurately, ensuring smooth video quality under varying network conditions. Key contributions include a comprehensive analysis of machine learning algorithms, the development of a throughput predictor and file size estimator, and the integration of these into a frame resolution adjustment system. The findings support the potential of AI-powered IoT systems to enhance multimedia data transmission, paving the way for scalable, high-quality IoVT applications in diverse scenarios.
Research Method
The research workflow is divided into five phases:
- Planning: Comprehensive research planning, including a proposal, timeline, and budget.
- Literature Review: Studying existing technologies through literature review and preliminary experiments, yielding insights into current methods.
- Method Development: Designing methods and protocols to enhance multimedia data transmission in Wireless Sensor Networks (WSNs).
- Data Collection: Preparing scenarios and gathering data for evaluation, focusing on multimedia data transmission across various WSN scenarios.
- Evaluation: Analyzing collected data to assess the success of the developed methods and protocols.
Without the frame resolution adjustment system, throughput simply follows the available network capacity, so the frame rate drops when network conditions deteriorate. With the system in place, the frame rate remains stable, balancing video quality against network performance. This keeps the frame rate within acceptable limits and preserves the viewing experience despite network fluctuations.
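The contrast between the two cases can be shown with a small Python sketch. All numbers are hypothetical (the frame sizes and the throughput trace are made up for illustration), and for simplicity the adaptive case uses the actual throughput rather than a predicted value.

```python
# Hypothetical bytes per frame, ordered from highest to lowest resolution.
FRAME_SIZES = [46_000, 18_000, 11_500, 2_900]
TARGET_FPS = 5.0


def achievable_fps(throughput_bps, frame_bytes, cap_fps=TARGET_FPS):
    """Frames per second the link can carry, capped at the target rate."""
    return min(cap_fps, throughput_bps / (8 * frame_bytes))


def adapted_frame_bytes(throughput_bps, sizes=FRAME_SIZES, target_fps=TARGET_FPS):
    """Largest frame size that still sustains the target rate, else the smallest."""
    budget = throughput_bps / (8 * target_fps)
    return next((s for s in sizes if s <= budget), sizes[-1])


# Hypothetical throughput samples in bits per second.
throughput_trace = [2_000_000, 900_000, 400_000, 1_500_000]

# Fixed resolution: the frame rate tracks the network and drops when it degrades.
fixed = [achievable_fps(t, FRAME_SIZES[0]) for t in throughput_trace]

# Adjusted resolution: smaller frames are sent so the target rate is sustained.
adaptive = [achievable_fps(t, adapted_frame_bytes(t)) for t in throughput_trace]

if __name__ == "__main__":
    for t, f, a in zip(throughput_trace, fixed, adaptive):
        print(f"{t:>9} bps  fixed={f:.2f} fps  adaptive={a:.2f} fps")
```

With a fixed frame size the rate falls below the target whenever throughput dips, whereas shrinking the frame keeps the rate at the target at the cost of resolution.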
Conclusion
To optimize the system, experiments were conducted on various machine learning methods to predict data throughput effectively. The SES method outperformed other models, such as ARIMA variants and LSTM, in prediction accuracy, as measured by RMSE, MAE, MAPE, and R² values. Additionally, the SES method had the fastest execution time, at just 0.018 seconds. The throughput predictor, implemented on a Raspberry Pi server, was combined with a file size estimator that demonstrated a mean error rate of 6.73%. Together, these tools enabled the system to adjust frame resolution dynamically, maintaining a frame rate of five fps even under network congestion and resource constraints.
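As an illustration of the forecasting and evaluation steps, the following Python sketch assumes that SES refers to simple exponential smoothing. The smoothing factor `alpha`, the sample throughput values, and the function names are hypothetical; the study's actual parameters and data are not given here.

```python
import math


def ses_forecasts(series, alpha=0.5):
    """Simple exponential smoothing.

    Returns a list where element i is the one-step-ahead forecast for
    series[i + 1], made after observing series[0..i].
    """
    level = series[0]
    out = []
    for y in series:
        level = alpha * y + (1 - alpha) * level
        out.append(level)
    return out


def error_metrics(actual, predicted):
    """RMSE, MAE, MAPE (in percent), and R^2 for paired observations."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "r2": 1 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    throughput_kbps = [100, 120, 80, 100]  # hypothetical samples
    forecasts = ses_forecasts(throughput_kbps, alpha=0.5)
    # Compare each forecast with the observation it was predicting.
    metrics = error_metrics(throughput_kbps[1:], forecasts[:-1])
    print("next-step forecast:", forecasts[-1])
    print(metrics)
```

Each update is a single multiply-add per sample, which is consistent with the short execution time reported for the SES method on constrained hardware.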
Currently, the system supports only MJPEG encoding due to device limitations, but future work could explore energy-efficient encoding formats like low-power H.264. Further research might involve alternative power sources such as batteries or solar panels, and more advanced machine learning models to enhance throughput prediction accuracy in IoT settings. Other potential enhancements include using machine learning to predict motion for real-time resolution adjustments and developing an interface for client video requests.