Sidló Csaba István, Ph.D.
I am a researcher and computer engineer with more than 10 years of experience and expertise in managing and analyzing large-scale databases. My main interests are "big data" and "data science" applications and the scalable, elastic data management systems that serve them, for example processing and analyzing IoT, mobile phone, or Web data.
Education
- Eötvös Loránd University, Budapest
- 2003: MSc, Program Designer Mathematician
- 2012: PhD in Informatics, Information Systems program, dissertation: Business Intelligence on Scalable Architectures
- 2002: Friedrich Schiller Universität, Jena, Germany
Research areas and selected publications
- Business intelligence on scalable architectures: data integration and cleaning
- Entity resolution, or deduplication, is my primary research area: identifying and merging database records that refer to the same real-world entity is computationally hard, but required in many analytics scenarios.
- Csaba István Sidló, András Garzó, András Molnár, András A. Benczúr: Infrastructures and Bounds for Distributed Entity Resolution. In: Proceedings of the 9th International Workshop on Quality in Databases (QDB), in conjunction with VLDB 2011, 2011.
- Large scale real-time analytics
- Analyzing big data sets – e.g. IoT or mobile phone data – with low latency requires data streaming approaches, where achieving scalability and elasticity is still challenging.
- Garzó A, Benczúr A A, Sidló Cs I, Tahara D, Wyatt E F: Real-time streaming mobility analytics. In: Xiaohua Hu (editor), IEEE International Conference on Big Data, IEEE, 2013, pp. 697-702, ISBN: 978-1-4799-1292-6.
- IoT data processing, crowd-sensing
- Viharos Zs J, Sidló Cs I, Benczúr A A, Csempesz J, Kis K B, Petrás I, Garzó A: "Big Data" Initiative as an IT Solution for Improved Operation and Maintenance of Wind Turbines. In: European Wind Energy Association (EWEA), Vienna, pp. 184-188, 2013.
- K. Farkas, G. Fehér, A. Benczúr, Cs. Sidló: Crowdsensing Based Public Transport Information Service in Smart Cities. IEEE Communications Magazine 53(8), pp. 158-165, 2015.
Selected projects
- Extensive comparison of (IoT) data ingestion solutions, 2018
- Collecting and loading large amounts of data (produced by e.g. IoT devices) into analytical data processing platforms is a challenging task, where the capabilities of traditional business intelligence ETL tools may easily become a bottleneck. We comprehensively reviewed, tested and ranked the most promising solutions to the distributed "data ingestion" big data problem: NiFi, Kafka Connect, Gobblin, Spring Integration, StreamSets, Flume and Camel / ServiceMix.
- Mobile session drop prediction models, 2017-2018
- We built ML models predicting mobile call and network session drops on cell phones. Activities included maintaining Android data collection and test applications, managing and analyzing the gathered data, and building prediction models.
- Background data services for an IoT data market, 2016-2017
- Building a scalable background service to support storing, sharing and transforming large scale IoT data (e.g. GPS locations or home automation), applying distributed and cloud-enabled technologies like Kafka, Spark, Couchbase, with intensive testing and profiling as main priorities.
- Integrating and cleaning insurance client data, 2015-2016
- Providing an integrated view of the client data of a Hungarian insurance company, to start a new customer master database and service. The task required scalable methods for deduplication, data transformation and other ETL tasks.
- Network data visualization and search, 2009-
- Graph (network) data is hard to search and analyze. We develop and provide client and server side tools and methods to efficiently search, analyze and visualize network data – e.g. insurance data for fraud detection.
- Data warehousing IT audit logs and webserver logs, 2009-2011
- Collecting, processing, storing and analyzing information-access events, designing and implementing data warehouses for an insurer and a telecom company.
Awards
- 2003: CEEPUS scholarship, Plovdiv, Bulgaria
- 2004: scientific team award, ELTE, Faculty of Informatics, for co-authoring the book "Informatikai algoritmusok" (Algorithms of Informatics)
- 2005: ETIK award (Inter-University Centre for Telecommunications and Informatics)
- 2008, 2011: MTA SZTAKI institute awards