1 Jan 2005 – 31 Dec 2007

Nyelvbányász / Language Miner

Unsupervised learning of natural languages and their utilization in office environments

A critical factor in the successful operation of today’s enterprises is making their documents easily accessible to employees. The “Language Miner” project addresses this need by exploring a new methodology for developing language models through self-organized learning processes that can learn from extremely large corpora of unannotated text. This approach can drastically reduce the influence of the ad hoc elements present in today’s typical language models, thereby significantly improving the performance of the new models. The project brings together researchers from computational linguistics, mathematics, cognitive science, physics, data mining and machine learning. This multidisciplinary approach will open up new ways toward a possible breakthrough in language technology. The project targets not only fundamental issues in language modeling, but also specific language technologies and applications that build on the newly developed language models. The involvement of industrial partners and end users ensures that practical end-user requirements will influence the research from the very beginning of the project.

Participants

MTA SZTAKI (Machine Learning Group and Data Mining and Web Search Group)
ELTE (Department of Computer Science, Physics of Complex Systems)
BME (Institute of Mathematics, Department of Stochastic Analysis)
Research Institute for Linguistics (HAS)
MTA SZFKI
Omega Consulting
Pont Rendszerház

Manager

András Benczúr, Ph.D.

+36 1 279 6172
