GHOST Day: Applied Machine Learning Conference

Dumitru Erhan

Staff Research Scientist & TLM | Google Brain

Visual Self-supervised Learning and World Models

In order to build intelligent agents that quickly adapt to new scenes, conditions, tasks, we need to develop techniques, algorithms and models that can operate on little data or that can generalize from training data that is not similar to the test data. World Models have long been hypothesized to be a key piece in the solution to this problem. But world models are only one of the potential ways to achieve this: there is a universe of ways to use unsupervised or weakly supervised data to learn b ...

Dumitru Erhan is a Staff Research Scientist and Tech Lead Manager on the Google Brain team in San Francisco. He received a PhD from the University of Montreal (MILA) in 2011 with Yoshua Bengio, where he worked on understanding deep networks. Since then, he has done research at the intersection of computer vision and deep learning, notably object detection (SSD), object recognition (GoogLeNet), image captioning (Show & Tell), visual question-answering, unsupervised domain adaptation (PixelDA), active perception and others. Recent work has focused on video prediction and generation, as well as its applicability to model-based reinforcement learning. He aims to build and understand agents that can learn as much as possible through self-supervised interaction with the environment, with applications to the fields of robotics and self-driving cars. Dumitru divides his free time between family, cooking and cycling through the Bay Area!

Piotr Mirowski

Staff Research Scientist | DeepMind

Skilful precipitation nowcasting using deep generative models of radar

Precipitation nowcasting, the high-resolution forecasting of precipitation up to two hours ahead, supports the real-world socioeconomic needs of many sectors reliant on weather-dependent decision-making. State-of-the-art operational nowcasting methods typically advect precipitation fields with radar-based wind estimates, and struggle to capture important nonlinear events such as convective initiations. Recently introduced deep learning methods use radar to directly predict future rain rates, fre ...

Piotr Mirowski is a Staff Research Scientist at DeepMind. He is mainly interested in reinforcement learning and navigation-related research, in scaling up autonomous agents to real-world environments and in weather and climate modeling, but has also investigated the use of AI for artistic human and machine-based co-creation. After studying computer science in France, he obtained his Ph.D. at NYU (Outstanding Dissertation Award) under the supervision of Prof. Yann LeCun. He worked at Schlumberger Research, at the NYU Comprehensive Epilepsy Center, at Bell Labs, and at Microsoft Bing on topics like epileptic seizure prediction from EEG, the inference of gene regulation networks, information retrieval and search query autocompletion, WiFi-based geolocalisation, and robotic navigation.

Ishan Misra

Research Scientist | Meta AI Research

Self-supervised learning for images, video, and 3D

Supervised learning has been the primary success story in computer vision. Pretraining on large, labeled data leads to highly transferable feature representations. In this talk, I will present self-supervised methods we developed at FAIR that can learn representations that surpass or match the quality of supervised pretrained methods. All these methods are based on the simple principle of learning representations that are invariant to visual transforms. This simple principle leads to powerful me ...

Ishan Misra finished his Ph.D. at the Robotics Institute at Carnegie Mellon University in 2018. Since then, he has been working as a Research Scientist at Meta AI Research (FAIR). His main research interests are Computer Vision and Unsupervised Learning, and he has published multiple research papers on Self-Supervised Learning and Visual Representation together with prominent researchers like Yann LeCun and Martial Hebert. Ishan's work has won multiple awards, such as the best paper award at WACV 2014 and a best paper nomination at CVPR 2021. Ishan was also a guest on the Lex Fridman Podcast and ML Street Talk.

Tomas Mikolov

Senior Researcher | Czech Institute of Informatics, Robotics and Cybernetics

Complex systems for AI

In this talk, I will describe some of our recent efforts to develop mathematical models which can spontaneously evolve and increase in complexity. We hope such models can be a basis for stronger AI models, which could possibly learn, adapt and develop in time without the need for supervision or even rewards. This would allow us to solve tasks which are currently too challenging for the mainstream machine learning algorithms, such as smart chatbots or other applications where learning on the fly ...

Tomáš Mikolov, PhD, is a Senior Researcher at the Czech Institute of Informatics, Robotics and Cybernetics in Prague. Before that, he conducted research at Johns Hopkins University, the University of Montreal, and the Brno University of Technology, from which he obtained his PhD in 2012 for RNN-based language models. Later that year he joined Google Brain, where he worked on neural networks applied to natural language processing problems such as representation learning (the word2vec project), neural language modeling and machine translation. During his work at Facebook AI Research, he co-authored fastText, a library for text classification and representation learning. In 2020 he moved to Prague to form a new research group at Czech Technical University focused on evolving mathematical models, the foundation of general AI.

Ivona Tautkute

Tech Lead and Senior AI Engineer | Tooploox

AI for creative applications & art

Progress in generative AI methods has enabled their application in the creative industries as well. In this talk, I will give a brief overview of the technology and tools behind AI-generated content, and present my applications of GANs to creative photography manipulation and animation creation: a project on creating animations from my photography collections based on GANs. Furthermore, the presentation will contain examples of AI in media content creation as well as an introduct ...

Ivona is an Artificial Intelligence and Machine Learning Engineer & Researcher with a focus on Computer Vision. She holds a Master's and Bachelor's degree in Mathematics from the University of Warsaw and is currently pursuing a Ph.D. in Computer Science at the Polish-Japanese Academy of Information Technology. The topic of her doctoral thesis is "Artificial neural networks for multimodal data embeddings and classification". Currently, Ivona works as a Tech Lead and Senior AI Engineer at the software company Tooploox. Her areas of expertise include image recognition, object detection, object segmentation, 3D data analysis, time series prediction, and generative adversarial networks. She has published her research at major AI/ML conferences and journals, such as CVPR, IEEE Access and ICONIP, and has presented her work at technology conferences in Los Angeles, New York, Lausanne, Prague, Salt Lake City and more. Ivona is also a prominent AI artist, working with photography and GANs. Her artworks are exhibited at galleries worldwide and were recently sold at the prestigious auction house Sotheby's in New York.

Wojciech Rosinski

CTO | ReSpo.Vision

Methods for efficient management and deployment of complex deep learning systems: a SportsTech use case

Running complex machine learning workflows at scale requires a set of specialized tools covering different stages of the process. Methods of flexible pipeline parametrization, robust experiment tracking and job scheduling are among key ingredients to ensure that the workflow is easily manageable. I will describe the architecture and workflow that we use for managing our system enabling 3D data extraction from single-camera sports videos. The system combines multiple deep learning and machine le ...

Wojciech Rosinski is CTO at ReSpo.Vision, a SportsTech startup aiming to revolutionize football analytics by leveraging the latest, cutting-edge AI research to watch & analyze football games, providing players, scouts, managers, clubs and federations with unmatched depth of knowledge. He has extensive experience in both R&D and industry projects, where he has worked on diverse problems spanning computer vision, natural language processing and tabular data. He is a Kaggle Master with 2 gold medals and multiple high finishes.

Andrew Breen

Senior Manager | Amazon TTS Research

A brief history of Neural TTS

For many years “concatenative” speech synthesis was the industrial standard for text-to-speech technology. It provided relatively high-quality (and in limited domains very high-quality) synthetic audio sufficient for widespread commercial use. However, it had its limits. It required large amounts of pre-recorded audio spoken by professional voice talents, and recordings were constrained to a narrow range of expressivity. These constraints meant that many applications, which required expressi ...

Andrew has a B.Sc. (Hons) in Physics with Computing Physics from University College Swansea, an M.Sc. (Eng.) by research from Liverpool University, and a Ph.D. in Speech Science from University College London. Andrew worked at BT Labs on ASR, and led teams on TTS, Avatars and multi-modal distributed systems. In 1999 he joined the University of East Anglia as a Sr. Lecturer. He joined Nuance in 2001 as founder of their TTS organisation, eventually becoming Director of TTS Research, and Product Development in India and China. In 2017 he joined Amazon as the Sr. Manager for research in Amazon’s TTS organization.

Mateusz Półtorak

Senior Data Scientist | Pearson

Semantic information extraction

Semantic information extraction is a hot topic (as is NLP as a whole). Semantic extraction is applied in a variety of products, such as automated assistants, chatbots, or intelligent tutoring systems. This talk will focus on natural language processing techniques that are designed to extract semantic information from text input. During the talk we will highlight the differences between paraphrase detection and natural language inference. This will allow us to define which use cases can benefit from these ...

Mateusz's great passion is the modeling and mathematical approximation of the complex mechanisms that rule our world. For over two years now, his work has focused mainly on natural language processing. Currently, he focuses on merging two worlds: the world of NLP and real-world use cases, such as automatic assessment of English open-ended language tests. Privately, he worships all kinds of art.

Adam Kupryjanow

AI Applied Research Scientist | Intel

How AI boosted Audio Processing?

Historically, all audio signal processing was done using traditional digital signal processing (DSP) techniques. The two main issues with neural networks were compute complexity and the quality of the processed audio signals. In 2016, a neural network called WaveNet was designed. This was the first network able to perform high-quality noise reduction, but its compute complexity was too high. It was impossible to run it in real time on a regular PC. Then in 2017 research was moved one ste ...

Adam is an AI Applied Research Scientist in the Intel Audio team in Gdansk, Poland. He received a Ph.D. from Gdansk University of Technology (GUT) in 2013, where he worked on methods for speech intelligibility improvement. At Intel he has been involved in many projects devoted to speech and voice processing. He has developed algorithms and IP blocks such as beamforming, reverb reduction, noise reduction and automatic gain control. These solutions are used in client products (laptops) and IoT devices (smart speakers, smart fridges, smart microwaves, kitchen robots).