2020 Volume 28 Pages 538-550
Trajectories record the spatial information of moving objects. Over continuous time spans, trajectory data form streams that are constantly generated from diverse, geographically distributed sources. Discovering traveling patterns on trajectory streams, such as gatherings and companies, enables valuable domain applications. Such discovery must process records arriving from various sources and correlate across them in near real time. Techniques for handling trajectory streams should therefore scale on distributed cluster computing. The challenge lies in three aspects: a data model to represent the continuous trajectory data, the parallelism of the discovery algorithm, and an end-to-end parallel framework. In this paper, we propose a parallel discovery method that consists of 1) a model for partitioning trajectory samples over various time intervals; 2) a definition of distance measurements for trajectories; and 3) a parallel discovery algorithm. We build a stream processing workflow and conduct experiments on a public dataset to evaluate the system's performance, scalability, stability, and data intensity. Our method discovers trajectory gathering patterns with low latency and scales as the size of the trajectory data grows.
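
As a rough illustration of the first two components, the sketch below partitions trajectory samples into fixed-width time intervals and then, within each interval, reports pairs of objects that lie close together. It is not the paper's model or algorithm: the sample layout `(object_id, t, x, y)`, the fixed interval width, the Euclidean distance, the threshold `max_dist`, and the function names are all assumptions made for this example.

```python
from collections import defaultdict
from math import hypot


def partition_by_interval(samples, interval):
    """Group (object_id, t, x, y) samples into time buckets of width `interval`.

    Assumption: a fixed-width bucket stands in for the paper's partitioning model.
    """
    buckets = defaultdict(list)
    for obj, t, x, y in samples:
        buckets[int(t // interval)].append((obj, x, y))
    return dict(buckets)


def close_pairs(bucket, max_dist):
    """Return sorted pairs of distinct objects within `max_dist` of each other.

    Assumption: plain Euclidean distance stands in for the paper's distance measure.
    """
    pairs = set()
    for i, (a, ax, ay) in enumerate(bucket):
        for b, bx, by in bucket[i + 1:]:
            if a != b and hypot(ax - bx, ay - by) <= max_dist:
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


# Hypothetical samples: (object_id, timestamp, x, y)
samples = [
    ("a", 0.0, 0.0, 0.0),
    ("b", 1.0, 0.5, 0.0),
    ("c", 2.0, 9.0, 9.0),
    ("a", 10.0, 5.0, 5.0),
    ("c", 11.0, 5.0, 5.5),
]

buckets = partition_by_interval(samples, interval=10.0)
# Each bucket is handled independently of the others.
result = {k: close_pairs(v, max_dist=1.0) for k, v in sorted(buckets.items())}
print(result)
```

Because each time bucket is processed without reference to the others, the per-bucket step could in principle be handed to separate workers, which is the kind of parallelism the partitioning is meant to expose.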