Utilizing MapReduce to Improve Probe-Car Track Data Mining
<p>Technical architecture of a probe-car system.</p> "> Figure 2
<p>(<b>a</b>) Global Positioning System (GPS) track; (<b>b</b>) track point matching.</p> "> Figure 3
<p>Main processes of MapReduce.</p> "> Figure 4
<p>Workflow for the first map algorithm.</p> "> Figure 5
<p>Second map algorithm flow.</p> "> Figure 6
<p>Vehicle tracks.</p> "> Figure 7
<p>Different sizes of the divided region (m × n).</p> "> Figure 8
<p>Vehicle track.</p> "> Figure 9
<p>Lane path.</p> ">
Abstract
:1. Introduction
2. Literature Review and Key Issues of Mining Probe-Car Data
2.1. Literature Review
2.2. Probe-Car Collection Interval
2.3. The Key Issues in Probe-Car Data Mining
- (1)
- Position information (latitude and longitude).
- (2)
- Position information with noise, and the noise is affected by a variety of factors (GPS noise, clouds, the status of buildings nearby, indoor and outdoor conditions, and so forth).
- (3)
- Loss of spatial information: the sampling interval of probe-car tracks is usually long (tens of seconds or minutes), which will result in the loss of shape information.
- (4)
- Spatial information redundancy caused by intensive sampling and low speed.
- (5)
- The temporal correlation between some series of track points.
- (6)
- Additional property information: incidental event information, driving behavior information, sensor parameters while sampling points.
- (1)
- The close correlation of spatial data. Since the electronic map of the road network topology has a close correlation, it often needs to load all of the road network data into memory to handle all the tracks. Since the road network data is massive, it demands high performance for the hardware. Moreover, it results in low performance for searching the matching data in the entire road network to process each specific track, which is a fatal flaw for the massive spatial data mining of the probe-car track.
- (2)
- True path restoration for the probe-car track. It is crucial to combine the electronic map to restore the true path of probe-car tracks as accurately as possible. Since the sampling interval is sparse, the distance between two adjacent track points might be far. Due to data noise, the position of each track point may greatly deviate from the position of the real road, so large errors result from the conventional method, which matches the best roads according to each track point location on an electronic map. As shown in Figure 2a, if it is a partial match, a part marked I in the whole track will be matched to road 1. However, the whole track should be matched to road 2, as shown in Figure 2b. Therefore, combining the spatial characteristics of the entire track optimally and globally restores the path, while the local single track point cannot be matched to the best road.
3. Methodology
3.1. MapReduce Parallel Distributed Computing Model
3.2. MapReduce Method for Data Mining in Probe-Car Tracks
Enter the range A1 to be processed; A2 = A1 + 0.1 ° // Extend the range A1 as A2 M = LoadData (A2); // Load the map data of the range A2 into memory Enter the track collection D While(D != NULL) { P = Read(D); // Read one track in M in order if (P ∈ A1) // Track P belongs to the range A1 { Match track P with the electronic map M; If (matched) { log(P and M match relations); } else { log (non-matched trajectory P); } } else if (Part of P belongs to A1) / / Track P is cross-regional { log(Track P, the latitude, and the longitude of track P); } D = D – P; }
Enter the range A1 to be processed; Enter the set D of the cross-regional tracks to be processed Read the latitude and longitude range A2 in D; A3 = A1 + A2; M = LoadData(A3); // Load the map data of A3 into memory While(D != NULL) { P = Read(D); // Read one of the cross-region tracks in M in order Match track P with the electronic map M; If(matched) { log(P and M match relations); } else { log (non-matched trajectory P); } D = D – P; } while (log != NULL) { l = Read(log); // read a log in the log, matched relationship or non-matched track P if (l not in ControlServe) { l in the ControlServe; } log = log − l; }
4. Experimental Results and Discussion
4.1. Experiment Scenarios
4.2. Results
- (1)
- n and m should not be too small, or the proportion of the tracks across the regions will be very significant, which cannot reflect the effect of MapReduce. n and m should be chosen so that the cross-regional trajectory ratio is kept around 25%. m × n is limited by memory limitations.
- (2)
- n and m should not be too large. If m × n is larger, consumption memory and computing resources will be greater. The maximum of m × n is limited by the resources a single computing unit can provide.
- (3)
- Within a single computing unit resource, it is not necessarily good to have larger m × n values, but it would be better to have a greater range for m × n and a lower marginal effect of the inter-regional track proportion. To meet the conditions in which the cross-regional track ratio is below a certain range, the smaller m × n, the better.
5. Conclusions
Author Contributions
Acknowledgments
Conflicts of Interest
Nomenclature
ITS | Intelligent Transportation System |
GPS | Global Positioning System |
OEM | Original Equipment Manufacturer |
GPRS | General Packet Radio Service |
VICS | Vehicle Information and Communication System |
AATCC | Automobile Association Traffic Control Centre |
FHA | Federal Highway Administration |
ITA | Illinois Transportation Authority |
KRSA | Korean Road Safety Association |
GSCTI | German Space Center Transportation Institute |
KORTIC | Korea Road Traffic Information Center |
CCTV | Closed-Circuit Television |
POIs | Point of Interests |
References
- China Car Survey and Market Prospect Forecast Report. Available online: http://www.askci.com (accessed on 31 December 2017).
- Geng, X.; Wang, S.; JI, J. Fast road-matching algorithm of probe-car. J. Water Resour. Archit. Eng. 2013, 11, 122–125. [Google Scholar]
- Hao, Y.; Wu, G.; Zhou, S. A multi-vehicle speed fusion algorithm based on probe vehicle data. J. Transp. Inf. Saf. 2012, 30, 56–61. [Google Scholar]
- Zhu, T.; Guo, S. A study on floating car based information processing technology. J. Image Graph. 2009, 14, 1230–1237. [Google Scholar]
- Boyce, D.E.; Kirson, A.; Schofer, J.L. Design and implementation of advance: The Illinois dynamic navigation and route guidance demonstration program. In Proceedings of the Vehicle Navigation and Information Systems Conference, Dearborn, MI, USA, 20–23 October 1991; pp. 415–426. [Google Scholar]
- Cowan, K.; Gates, G. Floating vehicle data system—Realization of a commercial system. In Proceedings of the 11th International Conference on Road Transport Information and Control, London, UK, 19–21 March 2002; pp. 187–189. [Google Scholar]
- Pang, H. Research on Key Technologies of Urban Dynamic Traffic Guidance Systems Based on FCD. Master’s Thesis, University of Science and Technology of China, Hefei, China, 2009. [Google Scholar]
- Wang, Y. Research on the Key Technology of Large-Scale Strategic Traffic Coordination & Control System. Ph.D. Thesis, Jilin University, Changchun, China, 2009. [Google Scholar]
- Vehicle Information and Communication System Center. Introduction of VICS Ver. 2010; Vehicle Information and Communication System Center: Tokyo, Japan, 2010. [Google Scholar]
- Han, W.; Choi, K.K. An implementation of a Korea traffic information center over metropolitan Seoul region. In Proceedings of the Mobility for Everyone World Congress on Intelligent Transport Systems, Berlin, Germany, 21–24 October 1997; pp. 21–24. [Google Scholar]
- Schafer, R.; Thiessenhusen, K.; Wagner, P. A traffic information system by means of real-time floating-car data. In Proceedings of the ITS World Congress, Chicago, IL, USA, 14–17 October 2002; pp. 1–8. [Google Scholar]
- Jan, F.E.; Stephan, M.; Stefan, E.; Dirk, C.M. Data chain management for planning in city logistics. Int. J. Data Min. Model. Manag. 2009, 1, 335–356. [Google Scholar]
- System and Method for Realtime Community Information Exchange. Available online: https://www.cbinsights.com/company/waze-patents (accessed on 1 March 2016).
- Li, X.; Meng, Q. The applications of GPS technology in the Real-Time detection of city traffic condition to GPS. J. Ocean Univ. Qingdao 2002, 32, 475–481. [Google Scholar]
- Dong, J.; Wu, J.; Guo, J. Assessment of road network based on GPS/GIS data:a practice in Beijing. City Plan. Rev. 2005, 29, 70–74. [Google Scholar]
- Zhang, Z.; Lin, X.; Lin, S. Traffic parameter features in traffic incidents based on probe-car data. J. Transp. Inf. Saf. 2011, 29, 94–98. [Google Scholar]
- Xin, F.; Chen, X.; Lin, H. Research on time space distribution characteristics of probe-car data in road network. China J. Highw. Transp. 2008, 4, 105–110. [Google Scholar]
- Li, Q.; Yin, J.; He, F. A coverage rate model of GPS probe-car for road networks. Geomat. Inf. Sci. Wuhan Univ. 2009, 34, 715–718. [Google Scholar]
- Zhang, C. Research on the Traffic Data Collection and Data Processing Theory and Method Based on Probe-Car. Ph.D. Thesis, Tongji University, Shanghai, China, 2007. [Google Scholar]
- Weng, J.; Zhou, X.; Zhai, Y. Applications of probe-car data in urban macroscopic traffic character study. J. Wuhan Univ. Technol. (Transp. Sci. Eng.) 2008, 32, 806–809. [Google Scholar]
- Guo, J.; Wen, H.; Chen, F. Function analysis and application design of floating car system. J. Transp. Syst. Eng. Inf. Technol. 2007, 7. [Google Scholar] [CrossRef]
- Yamane, K.; Endo, Y.; Fujiwara, J.; Kumagai, M. Estimation of statistical traffic data for navigation system. Int. J. ITS Res. 2004, 2, 1–9. [Google Scholar]
- Gou, X.; Zuo, X.; Zhang, Y. Application of digital velocity model in urban traffic dynamics analysis based on probe-car. Sci. Technol. Eng. 2013, 13, 3172–3177. [Google Scholar]
- Xie, J. Design of Vehicle Multimedia Navigation System Based on AU1200. Master’s Thesis, Zhejiang University, Hangzhou, China, 2008. [Google Scholar]
- Schroedl, S.; Wagstaff, K.; Rogers, S.; Langley, P.; Wilson, C. Mining GPS Traces for Map Refinement. Data Min. Knowl. Discov. 2004, 9, 59–87. [Google Scholar] [CrossRef]
- Zheng, K.; Song, X.; Zhu, D. A New Method of Trajectory Restoration at Intersection. TELKOMNIKA 2015, 13, 563–570. [Google Scholar] [CrossRef]
- Dean, J.; Ghemawat, S. Map/Reduce advantages overparallel databases include storage-system independence and fine-grain fault tolerance for large jobs. Commun. ACM 2010, 53, 72–77. [Google Scholar] [CrossRef]
- Dean, J.; Ghemawat, S. Map/Reduce: Simplified data processing on large clusters. Commun. ACM 2008, 51, 107–113. [Google Scholar] [CrossRef]
- Afrati, F.N.; Ullman, J.D. Optimizing joins in a map-reduce environment. IEEE Trans. Knowl. Data Eng. 2011, 23, 1282–1298. [Google Scholar] [CrossRef]
- Biswapesh, C.; Liang, L.T. A SQL implementation on the MapReduce framework. In Proceedings of the VLDB Endowment, Athens, Greece, 12–16 June 2011; Volume 4, pp. 1318–1327. [Google Scholar]
- Fang, S.; Zhou, J.; Zhang, M. Research of improvement selection algorithm in cloud-computing web data mining based on Map/Reduce. Appl. Res. Comput. 2013, 30, 377–379. [Google Scholar]
- Dean, J. Experiences with MapReduce: An abstraction for Large-scale computation. In Proceedings of the IEEE 15th International Conference on Parallel Architectures and Compilation Techniques, Seattle, WA, USA, 16–20 September 2006. [Google Scholar]
- Anand, R.; Jeffrey, D.U.; Wang, B. Key Technologies and Application Research of Cloud Computing; People’s Posts and Telecommunications Press: Beijing, China, 2012. [Google Scholar]
System | Country | Developer | Main Characteristics/Data Source |
---|---|---|---|
Probe-car data | UK | ITIS Holdings Plc | From AATCC and real-time and historical data |
ADVANCE system | USA | FHA, ITA, and other agencies | Avoid traffic congestion, improve driving quality |
VICS | Japan | \ | From GPS navigation devices |
KORTIC | Korea | Korean Road Safety Association | From the toroidal coil, GPS probe-cars, and so on |
Probe-car Data System | Germany | Space Center Transportation Institute | From floating taxis |
Google Maps | USA | A human element is added to its traffic calculations, report traffic incidents from drivers |
Number of Probe-Car Tracks | Number of Vehicles | Number of Roads in the Electronic Map | Total Road Length (km) |
---|---|---|---|
2,056,489 | 51,973 | 12,538,343 | 21,453,863 |
RecordID | GpsDateTime | GpsLatitude | GpsLongitude | GpsAzimuth | GpsSpeed |
---|---|---|---|---|---|
204451900001 | 2017/8/31 4:24 | 38.45707628 | 141.2965907 | 49° | 39.6 km/h |
204451900002 | 2017/8/31 4:25 | 38.45992784 | 141.3001752 | 45° | 37.8 km/h |
204451900003 | 2017/8/31 4:33 | 38.46099555 | 141.3012858 | 347° | 9.9 km/h |
204451900004 | 2017/8/31 4:34 | 38.4589209 | 141.2973367 | 164° | 45 km/h |
204451900005 | 2017/8/31 4:35 | 38.46034831 | 141.2877371 | 166° | 57.6 km/h |
204451900006 | 2017/8/31 4:36 | 38.46274794 | 141.2803071 | 155° | 10 km/h |
204451900007 | 2017/8/31 4:37 | 38.46221842 | 141.2794217 | 255° | 36 km/h |
204451900008 | 2017/8/31 4:38 | 38.45871636 | 141.279031 | 276° | 44.1 km/h |
204451900009 | 2017/8/31 4:39 | 38.45225857 | 141.2803076 | 276° | 4.5 km/h |
204451900010 | 2017/8/31 4:40 | 38.44693848 | 141.2823128 | 272° | 42.3 km/h |
Processing Approach | Single Processing | Distributed Processing | Distributed Processing |
---|---|---|---|
Machine type | IBM server | Ordinary desktop | Ordinary desktop |
Machine physical memory | 16 GB | 2 GB | 2 GB |
Machine CPU | 2.13 GB, 4-core | 3.3 GB, 2 nuclear | 3.3 GB, 2 nuclear |
Number of machines | 1 | 4 | 8 |
Processing time | 10.5 h | 2.5 h | 2.1 h |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zheng, L.; Sun, M.; Luo, Y.; Song, X.; Yang, C.; Hu, F.; Yu, M. Utilizing MapReduce to Improve Probe-Car Track Data Mining. ISPRS Int. J. Geo-Inf. 2018, 7, 287. https://doi.org/10.3390/ijgi7070287
Zheng L, Sun M, Luo Y, Song X, Yang C, Hu F, Yu M. Utilizing MapReduce to Improve Probe-Car Track Data Mining. ISPRS International Journal of Geo-Information. 2018; 7(7):287. https://doi.org/10.3390/ijgi7070287
Chicago/Turabian StyleZheng, Li, Meng Sun, Yuejun Luo, Xiangbo Song, Chaowei Yang, Fei Hu, and Manzhu Yu. 2018. "Utilizing MapReduce to Improve Probe-Car Track Data Mining" ISPRS International Journal of Geo-Information 7, no. 7: 287. https://doi.org/10.3390/ijgi7070287
APA StyleZheng, L., Sun, M., Luo, Y., Song, X., Yang, C., Hu, F., & Yu, M. (2018). Utilizing MapReduce to Improve Probe-Car Track Data Mining. ISPRS International Journal of Geo-Information, 7(7), 287. https://doi.org/10.3390/ijgi7070287