Speakers

Visual Self-supervised Learning and World Models

by Dumitru Erhan | Staff Research Scientist & TLM | Google Brain

Biography:
Dumitru Erhan is a Staff Research Scientist and Tech Lead Manager in the Google Brain team in San Francisco. He received a PhD from the University of Montreal (MILA) in 2011 with Yoshua Bengio, where he worked on understanding deep networks. Since then, he has done research at the intersection of computer vision and deep learning, notably object detection (SSD), object recognition (GoogLeNet), image captioning (Show & Tell), visual question answering, unsupervised domain adaptation (PixelDA), active perception and others. His recent work has focused on video prediction and generation, as well as its applicability to model-based reinforcement learning. He aims to build and understand agents that can learn as much as possible from self-supervised interaction with the environment, with applications to the fields of robotics and self-driving cars. Dumitru divides his free time between family, cooking and cycling through the Bay Area!

Abstract:
In order to build intelligent agents that quickly adapt to new scenes, conditions, and tasks, we need to develop techniques, algorithms and models that can operate on little data or that can generalize from training data that is not similar to the test data. World Models have long been hypothesized to be a key piece of the solution to this problem. But world models are only one of the potential ways to achieve this: there is a universe of ways to use unsupervised or weakly supervised data to learn better reusable representations for our problems. In this talk, I will describe a number of recent advances in modeling and generating image and video observations. These approaches can help with building agents that interact with the environment and mitigate the sample complexity problems in reinforcement learning, but they also make supervised learning easier with few labeled examples. Such approaches can also enable agents that generalize more quickly to new scenarios, tasks, objects and situations and are thus more robust to environment changes. Finally, I will offer some speculative thoughts on compositional generalization and why I believe it's the natural next big challenge to explore in machine learning.

Skilful precipitation nowcasting using deep generative models of radar

by Piotr Mirowski | Staff Research Scientist | DeepMind

Biography:
Piotr Mirowski is a Staff Research Scientist at DeepMind. He is mainly interested in navigation-related reinforcement learning research, in scaling up autonomous agents to real-world environments and in weather and climate modeling, but has also investigated the use of AI for artistic human- and machine-based co-creation. After studying computer science in France, he obtained his Ph.D. at NYU (Outstanding Dissertation Award) under the supervision of Prof. Yann LeCun. He worked at Schlumberger Research, at the NYU Comprehensive Epilepsy Center, at Bell Labs, and at Microsoft Bing on topics like epileptic seizure prediction from EEG, the inference of gene regulation networks, information retrieval and search query autocompletion, WiFi-based geolocalisation, and robotic navigation.

Abstract:
Precipitation nowcasting, the high-resolution forecasting of precipitation up to two hours ahead, supports the real-world socioeconomic needs of many sectors reliant on weather-dependent decision-making. State-of-the-art operational nowcasting methods typically advect precipitation fields with radar-based wind estimates, and struggle to capture important nonlinear events such as convective initiations. Recently introduced deep learning methods use radar to directly predict future rain rates, free of physical constraints. While they accurately predict low-intensity rainfall, their operational utility is limited because their lack of constraints produces blurry nowcasts at longer lead times, yielding poor performance on rarer medium-to-heavy rain events. Here we present a deep generative model for the probabilistic nowcasting of precipitation from radar that addresses these challenges. Using statistical, economic and cognitive measures, we show that our method provides improved forecast quality, forecast consistency and forecast value. Our model produces realistic and spatiotemporally consistent predictions over regions up to 1,536 km × 1,280 km and with lead times from 5–90 min ahead. Using a systematic evaluation by more than 50 expert meteorologists, we show that our generative model ranked first for its accuracy and usefulness in 89% of cases against two competitive methods. When verified quantitatively, these nowcasts are skilful without resorting to blurring. We show that generative nowcasting can provide probabilistic predictions that improve forecast value and support operational utility, and at resolutions and lead times where alternative methods struggle.

Self-supervised learning for images, video, and 3D

by Ishan Misra | Research Scientist | Meta AI Research

Biography:
Ishan Misra finished his Ph.D. at the Robotics Institute at Carnegie Mellon University in 2018. Since then he has been working as a Research Scientist at Meta AI Research (FAIR). His main research interests are Computer Vision and Unsupervised Learning; he has published multiple research papers on self-supervised learning and visual representation learning, together with prominent researchers like Yann LeCun and Martial Hebert. Ishan's work has won multiple awards, such as the best paper award at WACV 2014 and a best paper nomination at CVPR 2021. Ishan was also a guest on the Lex Fridman Podcast and ML Street Talk.

Abstract:
Supervised learning has been the primary success story in computer vision. Pretraining on large, labeled data leads to highly transferable feature representations. In this talk, I will present self-supervised methods we developed at FAIR that can learn representations that surpass or match the quality of supervised pretraining. All these methods are based on the simple principle of learning representations that are invariant to visual transforms. This simple principle leads to powerful methods that can be easily applied to image, video, and 3D data, and can leverage large amounts of unlabeled data. The resulting self-supervised models can be used via transfer learning to create state-of-the-art object detection, action recognition and 3D recognition models. Self-supervised pretraining leads to more robust representations and can also help with 'tail' classes in recognition. Beyond transfer learning, I will show how self-supervised methods can discover objects: grouping pixels together using just image or audio signals.
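
The invariance principle described in this abstract is commonly realized as a contrastive objective of the InfoNCE family. Below is a minimal, dependency-free Python sketch of such a loss on hypothetical toy embeddings; it is an illustration of the general idea, not FAIR's implementation, and the embedding values and temperature are assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce_loss(view1, view2, temperature=0.1):
    """InfoNCE-style contrastive loss: embedding i in view1 should be
    closest to embedding i in view2; every j != i acts as a negative."""
    n = len(view1)
    total = 0.0
    for i in range(n):
        logits = [cosine(view1[i], view2[j]) / temperature for j in range(n)]
        m = max(logits)  # stabilised log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # -log softmax of the positive pair
    return total / n

# toy check: aligned views incur a lower loss than shuffled views
emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(info_nce_loss(emb, emb) < info_nce_loss(emb, [emb[1], emb[2], emb[0]]))  # True
```

Matching indices in the two lists play the role of two augmented views of the same image, which is what makes the learned representation invariant to the transform.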

Complex systems for AI

by Tomas Mikolov | Senior Researcher | Czech Institute of Informatics, Robotics and Cybernetics

Biography:
Tomáš Mikolov, PhD is a Senior Researcher at the Czech Institute of Informatics, Robotics and Cybernetics in Prague. Before that, he conducted research at Johns Hopkins University, the University of Montreal, and the Brno University of Technology, from which he obtained his PhD in 2012 for RNN-based language models. Later that year he joined Google Brain, where he worked on neural networks applied to natural language processing problems such as representation learning (the word2vec project), neural language modeling and machine translation. During his work at Facebook AI Research, he co-authored fastText, a library for text classification and representation learning. In 2020 he moved to Prague to form a new research group at the Czech Technical University focused on evolving mathematical models, the foundation of general AI.

Abstract:
In this talk, I will describe some of our recent efforts to develop mathematical models which can spontaneously evolve and increase in complexity. We hope such models can be a basis for stronger AI models, which could possibly learn, adapt and develop over time without the need for supervision or even rewards. This would allow us to solve tasks which are currently too challenging for mainstream machine learning algorithms, such as smart chatbots or other applications where learning on the fly without supervision is necessary.

Optimizing training datasets for expressive text-to-speech synthesis

by Monika Podsiadło | Head of Applied Research: Text-To-Speech | Google NYC

Biography:
Monika Podsiadło leads Text-to-Speech Applied Research at Google New York, with over 10 years of experience in the field. Her work is centered around managing a team focused on prosody, few-shot learning, and cross-lingual modeling. Before that, together with her team, she launched over 200 TTS voices in 30 languages, productionized WaveNet, and expanded Google Assistant. In 2007 Monika graduated from the University of Edinburgh, defending her master's thesis in Speech and Language Processing. After hours she mentors at BUILD, helping 9th graders launch a start-up.

Abstract:
One of the main problems in Machine Learning is the data we use to train the models. As the complexity of machine learning models increased, so did the size of the training datasets. In addition, a variety of "human-in-the-loop" approaches and data annotation requirements became standard practices in developing production-level systems. This data bottleneck is a particularly important consideration for text-to-speech (TTS) systems: the high-quality data that such systems rely on, including text but especially audio, is expensive and time-consuming to curate, and a true blocker to scaling TTS voice development. In this talk, I will examine the interplay between the training data and the synthetic voice quality using a number of state-of-the-art TTS backends as examples. I will show how big data is not always the best data, and I will go over some practical techniques on how to drive synthesis quality with intelligent dataset design to build ultra-natural and expressive TTS.

Linguistic markers predict onset of Alzheimer's disease

by Elif Eyigoz | Researcher | IBM Watson

Biography:
Elif Eyigoz, PhD works as a Research Staff Member at IBM Watson NY. She is a member of the Healthcare and Life Sciences research group. Elif has an exceptional multi-disciplinary background in Philosophy (BA), Cognitive Science (MA), Linguistics (MA), and Computer Science (MS and PhD). She joined IBM in 2014. In 2020 she co-authored a study on linguistic markers of Alzheimer's disease in which machine learning is used to detect early signs of progression of the illness.

Abstract:
The aim of this study is to use classification methods to predict future onset of Alzheimer's disease in cognitively normal subjects through automated linguistic analysis. To study linguistic performance as an early biomarker of AD, we performed predictive modeling of future diagnosis of AD from a cognitively normal baseline of Framingham Heart Study participants. The linguistic variables were derived from written responses to the cookie-theft picture-description task. We compared the predictive performance of linguistic variables with clinical and neuropsychological variables. The study included 703 samples from 270 participants, out of which a dataset consisting of a single sample from 80 participants was held out for testing. Half of the participants in the test set developed AD symptoms before the age of 85, while the other half did not. All samples in the test set were collected during the cognitively normal period (before MCI). The mean time to diagnosis of mild AD was 7.59 years. Significant predictive power was obtained, with an AUC of 0.74 and an accuracy of 0.70 when using linguistic variables. The linguistic variables most relevant for predicting onset of AD have been identified in the literature as associated with cognitive decline in dementia. The results suggest that language performance in naturalistic probes exposes subtle early signs of progression to AD in advance of clinical diagnosis of impairment.
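
The reported AUC of 0.74 can be read as a probability: the chance that a randomly chosen participant who later developed AD received a higher risk score than a randomly chosen participant who did not. A small pure-Python sketch of that rank-based (Mann-Whitney) computation, on hypothetical scores rather than the study's data:

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the probability that a randomly chosen positive case scores higher
    than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# hypothetical risk scores: 4 participants who progressed to AD (label 1)
# and 4 who stayed cognitively normal (label 0)
scores = [0.9, 0.7, 0.6, 0.4, 0.8, 0.5, 0.3, 0.2]
labels = [1,   1,   1,   1,   0,   0,   0,   0]
print(auc(scores, labels))  # → 0.75
```

An AUC of 0.5 would mean the linguistic variables carry no ranking information; 1.0 would mean they separate the two groups perfectly.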

Teaching Machine Learning to Children

by David Touretzky | Research Professor | Carnegie Mellon University

Biography:
David S. Touretzky is a Research Professor in the Computer Science Department and Neuroscience Institute at Carnegie Mellon University. He received his PhD in Computer Science from Carnegie Mellon in 1984. Dr. Touretzky's research interests include Cognitive Robotics, Computational Neuroscience, and Computer Science Education. He is founder and chair of the AI4K12 Initiative (AI4K12.org), which is developing national guidelines for teaching Artificial Intelligence in K-12. He is also the creator of Calypso, an intelligent robot programming framework that puts real artificial intelligence technology into the hands of children.

Abstract:
The AI4K12 Initiative (AI4K12.org), which I chair, is developing national guidelines for teaching artificial intelligence in K-12. Early in the project we published a list of Five Big Ideas in AI, with number 3 being machine learning, i.e., "Computers can learn from data." We have since released draft grade band progression charts for the first three big ideas; the remaining two will be released soon. Each chart describes, for each of four grade bands (K-2, 3-5, 6-8, and 9-12), what children should know about that big idea and what they should be able to do with it. For machine learning, these achievements are made possible in part by marvelous new software tools developed specifically for the K-12 audience, such as Google's Teachable Machine, Dale Lane's Machine Learning for Kids, and Code.org's AI Lab. In this talk I will highlight these tools, review our grade band progression chart for machine learning, and describe the machine learning insights and skills children can acquire.

Tweet-Topic Classification: The Real-Life Perspective

by Mateusz Fedoryszak | Data Scientist | Twitter

Biography:
Mateusz is an ML Engineer at Twitter. He works on understanding what people are tweeting about: currently through topic classification, previously through automated event detection. Before that he was analysing terabytes of scientific papers, looking for word boundaries in scriptio continua, and leveraging data to guess how much milk a cow would produce. A big fan of logistic regression, kNN and pretty charts. Formerly at ICM University of Warsaw, Microsoft and True Knowledge (now Amazon).

Abstract:
In this talk I'm going to present an ML-driven approach to topic classification that we use at Twitter. Although I'll share some details regarding the modelling, my focus will be on the challenges of deploying and maintaining a model in production. To begin with, I'll talk about the project background and the requirements that have shaped our design choices, among them the legacy systems in place and the data characteristics. I'll cover the BERT-based model architecture that we've designed, along with our evaluation methodology. I'll describe how our classifier is deployed and how it interplays with other systems. The problem of data gathering and bias will be of particular importance. I'll describe the problems with recall that we've faced and the active learning-inspired steps that we've taken to overcome them. Finally, I'll describe the continuous training and deployment pipeline that we've designed to ensure model freshness. To sum up, I'll present the whole process that was needed to build an ML system and ensure it performs well in a production environment.

Bayesian Optimization with Categorical and Continuous Variables

by Vu Nguyen | Machine Learning Scientist | Amazon

Biography:
Dr Vu Nguyen is a Machine Learning Scientist at Amazon Research Australia. His research interests include Bayesian optimisation for optimal decision making under uncertainty. Prior to this appointment, he was a Senior Research Associate in machine learning at the University of Oxford, working with Prof. Michael Osborne and Prof. Andrew Briggs. Other prior roles include Research Scientist at a research start-up, CreditAI, and Associate Research Fellow at Deakin University. Dr Nguyen obtained his PhD at Deakin University in 2015, where he was fortunate to have Professors Dinh Phung and Svetha Venkatesh as his advisors.

Abstract:
Bayesian optimization (BO) has demonstrated impressive success in optimizing black-box functions. However, there are still challenges in dealing with black boxes that include both continuous and categorical inputs. I am going to present our recent work on optimizing the mixed space of categorical and continuous variables using Bayesian optimization [1], and on how to scale it up to higher dimensions [2] and to the population-based AutoRL setting [3]. The talk is based on the following research: [1] B. Ru, A. Alvi, V. Nguyen, M. Osborne, and S. Roberts. "Bayesian optimisation over multiple continuous and categorical inputs." ICML 2020. [2] X. Wan, V. Nguyen, H. Ha, B. Ru, C. Lu, and M. Osborne. "Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces." ICML 2021. [3] J. Parker-Holder, V. Nguyen, S. Desai, and S. Roberts. "Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL." NeurIPS 2021.
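
A core ingredient in this line of work is a surrogate-model kernel defined over mixed inputs, built by combining a continuous kernel with a categorical one. The pure-Python sketch below shows one such construction in the spirit of the kernel in [1] (a mix of the sum and product of the two sub-kernels); the hyperparameter values and the toy inputs are illustrative assumptions, not the paper's code:

```python
import math

def rbf_kernel(x, y, lengthscale=1.0):
    """Standard RBF kernel over the continuous part of the input."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * lengthscale ** 2))

def overlap_kernel(c, d):
    """Categorical (overlap) kernel: fraction of categorical variables
    on which the two inputs agree."""
    return sum(1.0 for a, b in zip(c, d) if a == b) / len(c)

def mixed_kernel(p, q, lam=0.5):
    """Kernel over mixed inputs p = (continuous, categorical): a convex
    mix of the (normalised) sum and the product of the two sub-kernels."""
    k_cont = rbf_kernel(p[0], q[0])
    k_cat = overlap_kernel(p[1], q[1])
    return (1 - lam) * 0.5 * (k_cont + k_cat) + lam * k_cont * k_cat

# hypothetical hyperparameter configurations: (learning rates, choices)
p = ((0.2, 0.7), ("adam", "relu"))
q = ((0.3, 0.5), ("sgd", "relu"))
print(mixed_kernel(p, p))       # → 1.0 (identical points)
print(mixed_kernel(p, q) < 1.0)  # True (similarity drops with distance)
```

With such a kernel in hand, the usual Gaussian-process posterior and acquisition-function machinery of BO applies unchanged to the mixed search space.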

Towards in-silico drug design

by Marta Stępniewska-Dziubińska | Software Engineer | NVIDIA

Biography:
Marta is a Software Engineer at NVIDIA, working on deep learning models for computer vision and drug discovery. She started working with ML during her PhD, for which she built deep neural networks for structure-based drug discovery. Afterwards she decided to turn to industry and used her skill set for computer vision problems. She was involved in projects aimed at digitizing industry installation documentation, analyzing 3D point clouds and distance maps, and extracting useful information from surveillance videos. Now she is getting back to her roots, working again on deep learning models for rational drug discovery.

Abstract:
Developing a new drug is a complex and expensive process that requires years of experimentation. Computational methods have been widely used to replace experiments with simulations and predictive modeling, and to reduce the number of molecules that need to be tested in the laboratory. This allows researchers to focus on the most promising drug candidates and discard early those with unwanted physico-chemical properties, low activity, or a high risk of adverse reactions. In recent years deep learning has taken the field by storm, providing new solutions for a vast repertoire of problems related to the drug discovery process. Deep neural networks have proven more effective than classical models for tasks like predicting protein structures, properties of small molecules, or protein-ligand interactions. In this presentation I will lay out a landscape of deep learning methods used for drug discovery, providing examples of models for different stages of the drug design pipeline. I will showcase the most recent achievements and actively developing fields, but also present how models developed for other domains can be adapted for bio- and cheminformatics.

Embracing the Range of Data Science

by Jev Gamper | Staff Decision Scientist | Experimentation & Causal Inference | Vinted

Biography:
Jev Gamper is a Staff Decision Scientist at Vinted. His role involves leadership across all aspects of experimentation and causal inference, from implementing the scientific models underlying the experimentation system to setting up the technical roadmap and establishing an experimentation culture. Jev did his MSc in Applied Mathematics at the University of Warwick, where he is also a PhD candidate. His research involved applications of statistical and machine learning methods to medical imaging, astronomy, remote sensing, and climate modeling. Jev's research articles have been published at venues like CVPR and Monthly Notices of the Royal Astronomical Society. He is a board member of the Lithuanian AI Association and a co-organiser of the Eastern European Machine Learning Summer School in 2022 in Vilnius.

Abstract:
Okay, you know how to use every algorithm available in Sklearn; you know SQL; and you know how to use Jax or PyTorch. So, what is the next thing you should learn as a data scientist? In this talk, let's attempt to rediscover the full range of tools in the toolbox and move beyond supervised learning! We will attempt to review a full map of data science, and what it means for your ability to formulate problems and have fun solving them, in any domain of application.

How to build a video classification system when you can't rely on visual features

by Karol Żak | Senior Data & Applied Scientist | Microsoft

Biography:
Karol Żak is a self-taught Data & Applied Scientist with a strong Software Engineering background. For the last 5 years he has worked in the Commercial Software Engineering group at Microsoft, where he collaborated with some of the biggest organizations worldwide to build ML/DS solutions for their most pressing business problems. His main area of interest is computer vision, but throughout the years he has worked across the full spectrum of ML areas.

Abstract:
Karol will talk about a recent project he and his team worked on for one of Microsoft's top customers in the media industry. For this project he helped build a video classification system to aid the manual process of reviewing and categorizing video ads. What initially seemed like a "simple" computer vision task turned out to be a much more complex problem which had to be tackled from a different angle. In his presentation he will walk through the problem his team stumbled upon and the final approach they used to solve it.

How to detect silent model failure

by Wojtek Kuberski | Data Scientist | Co-Founder | NannyML

Biography:
Wojtek Kuberski is a co-founder of NannyML, a startup for monitoring ML models in production. He holds a Master's Degree in AI. He previously founded and grew an AI consultancy. He likes tennis, chess, and food.

Abstract:
AI algorithms deteriorate and fail silently over time, impacting the business' bottom line. This talk focuses on how you should monitor ML models in production. It is a conceptual and informative talk addressed to Data Scientists & Machine Learning Engineers. We'll learn about the types of failures and how to detect and address them.
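
One common ingredient of such monitoring, when ground-truth labels arrive late or never, is tracking drift in the model's inputs or scores. Below is a minimal pure-Python sketch of the Population Stability Index, a classic drift statistic; it is a generic illustration with made-up data, not NannyML's actual algorithm:

```python
import math

def psi(expected, actual, bins=10, lo=0.0, hi=1.0, eps=1e-4):
    """Population Stability Index between a reference ('expected') and a
    production ('actual') sample of a model input or score.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift."""
    width = (hi - lo) / bins
    def hist(xs):
        counts = [0] * bins
        for x in xs:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # floor each proportion at eps so the log below is always defined
        return [max(c / len(xs), eps) for c in counts]
    e, a = hist(expected), hist(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# hypothetical reference scores vs drifted production scores
ref = [i / 100 for i in range(100)]                      # uniform on [0, 1)
prod = [min(0.2 + i / 200, 0.999) for i in range(100)]   # shifted, compressed
print(psi(ref, ref) == 0.0)   # identical samples → no drift
print(psi(ref, prod) > 0.25)  # shifted sample → major drift flagged
```

A large PSI on inputs or predictions is a cheap early-warning signal that the model may be failing silently, before any labelled outcomes confirm it.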

How artificial intelligence and deep learning speed up synthesis planning in drug discovery

by Stanisław Jastrzębski | Chief Scientific Officer | Molecule.one

Biography:
Stanisław Jastrzębski serves as Chief Scientific Officer at Molecule.one, where he is helping develop AI for organic chemistry. Prior to that, he was an Assistant Professor at Jagiellonian University (as part of GMUM.net) and a postdoc at New York University. He completed his PhD at Jagiellonian University, advised by Jacek Tabor and by Amos Storkey from the University of Edinburgh. His thesis focused on fundamental aspects of deep learning and was largely based on work done in collaboration with Yoshua Bengio at MILA. He is actively contributing to the machine learning community as an area chair for leading conferences (NeurIPS, ICLR, ICML). His long-term interest is to develop AI and deep learning for discovering novel scientific knowledge.

Abstract:
Applications of artificial intelligence and deep learning to drug discovery are on the rise. Naturally, a key task in drug discovery is planning how to synthesize (make in the lab) a chemical molecule of interest. In this talk, I will describe some of the most interesting techniques that are commercially used to automatically and rapidly plan the synthesis of potential drugs, especially at Molecule.one. I will also share some broadly applicable lessons about the surprising difficulty of deploying robust AI systems. On the whole, this talk aims to be a gentle introduction to this fascinating application area, which will hopefully inspire you to look closer into the broader field of AI for drug discovery.

Data Privacy - building ML solutions for your customers without looking at the data

by Tomasz Marciniak | Software Development Engineer | Microsoft

Biography:
Tomasz Marciniak is a software development engineer at Microsoft, working on web search relevance at Bing since 2009 and on enterprise search since 2018. He specializes in the development of ML techniques for large-scale search applications. Previously he worked as an applied research engineer at Language Computer Corporation, with a focus on language generation for Question Answering, and at Yahoo! on applying ML & NLP to web search and search marketing. Education: Cognitive Linguistics at Maria Curie-Skłodowska University in Lublin, Computational Linguistics at Heidelberg University.

Abstract:
Building Machine Learning solutions for external customers such as businesses or public institutions adds a level of complexity often missing in internal or research applications. The data used to train ML models is often confidential and should be fully owned and managed only by the customers. Revealing it in any form, even to the researchers or engineers who develop the system, might pose serious financial or security threats. This generates challenges during the entire ML development cycle: from designing and building systems for collecting and storing the data, through implementing the pipelines for data preprocessing, such as labeling or featurization, to training and deploying the models. We will discuss the topic of data privacy using the example of Microsoft Search, an enterprise search engine, and will contrast it with the related concepts of data security and anonymity.

Fast Synthetic Graph Generators for Graph Neural Networks

by Piotr Bigaj | Deep Learning Algorithms Manager | NVIDIA Poland

Biography:
Piotr is a deep learning algorithms manager at NVIDIA. He holds a PhD in automatics and robotics from the Polish Academy of Sciences. Before NVIDIA, he worked for Samsung R&D Poland as a Principal Software Engineer in the AI Division, where he led the Big Data and Recommendation Systems Team delivering large-scale RecSys models. At NVIDIA, his work concentrates on deep learning algorithms for tabular data, including recommender systems and, recently, graph neural networks.

Abstract:
Graph Neural Networks (GNNs) are a class of neural networks that process data represented by graph structures. They can solve tasks from different data domains like computer vision, NLP, recommender systems and others. The most common way of training GNNs is to perform edge sampling. With edge sampling, due to the sparse structure of the graph, the amount of computation per batch differs, making the model's performance vary during training. To simulate the performance of GNN models, we need good synthetic graph generators that can mimic the properties of the original large graph. Unfortunately, such graph generators are not that common. This talk will show how fast graph generators based on the Stochastic Kronecker Matrix Product work, how they relate to the famous Erdős–Rényi model, how to fit such a generative model, and whether a simple generator can produce graphs with good characteristics.
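
For a flavour of the technique, a stochastic Kronecker (R-MAT-style) generator can be sketched in a few lines of pure Python: each edge is sampled by recursively descending into one of the four quadrants of the adjacency matrix with probabilities given by a 2x2 initiator matrix. The initiator values below are the classic R-MAT defaults, used here only as an assumption, not the talk's fitted parameters:

```python
import random

def rmat_edges(scale, n_edges, probs=(0.57, 0.19, 0.19, 0.05), seed=0):
    """Sample n_edges distinct edges of a 2**scale-node graph by
    recursively choosing quadrants (a, b, c, d) of the adjacency matrix,
    which is equivalent to sampling from the Kronecker power of the
    2x2 initiator matrix [[a, b], [c, d]]."""
    rng = random.Random(seed)
    a, b, c, d = probs
    edges = set()
    while len(edges) < n_edges:
        src = dst = 0
        for _ in range(scale):  # descend 'scale' levels of quadrants
            r = rng.random()
            src, dst = src * 2, dst * 2
            if r < a:
                pass                         # top-left quadrant
            elif r < a + b:
                dst += 1                     # top-right
            elif r < a + b + c:
                src += 1                     # bottom-left
            else:
                src, dst = src + 1, dst + 1  # bottom-right
        edges.add((src, dst))
    return sorted(edges)

edges = rmat_edges(scale=10, n_edges=500)
print(len(edges))                                               # → 500
print(all(0 <= s < 1024 and 0 <= t < 1024 for s, t in edges))   # True
```

Setting all four probabilities to 0.25 recovers a uniform (Erdős–Rényi-like) edge distribution, while the skewed defaults concentrate edges in a heavy-tailed, community-like pattern typical of real graphs.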

Learning Representations for Hotel Ranking

by Ioannis Partalas | Principal Machine Learning Scientist | Expedia Group

Biography:
Ioannis Partalas works as a Principal Machine Learning Scientist at Expedia Group. His current focus is representation learning in the context of recommendation and ranking systems. Previously he worked as a Research Scientist at Viseo Group, France, on Natural Language Processing, building scalable approaches for various tasks such as text classification, named-entity recognition and opinion mining. Before that he was an associate researcher at Grenoble-Alpes University, working on large-scale/extreme classification systems.

Abstract:
In this talk I will present work on learning item representations from user click-sessions in the hospitality domain, and more specifically from the Expedia Group online platforms. I will present in detail the proposed neural architecture, which leverages side information about items, like attributes and geographic information, in order to learn a joint embedding. I will also explain how it addresses the cold-start problem, which is typical in recommendation systems. Results on a downstream task show that including such structured information improves predictive performance. Finally, I will show through the results of online controlled tests that the model generates high-quality representations that boost the performance of a hotel recommendation system on the Expedia travel platform.
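
Learning item embeddings from click-sessions typically follows the word2vec recipe, with sessions playing the role of sentences. Below is a minimal, hypothetical sketch of the first step, extracting skip-gram (target, context) training pairs from sessions of hotel IDs; the IDs and window size are invented for illustration and this is not Expedia's implementation:

```python
def skipgram_pairs(sessions, window=2):
    """Turn ordered click-sessions into (target, context) pairs,
    treating each session as a 'sentence' in a word2vec-style
    skip-gram setup: items clicked close together in a session
    end up close together in the embedding space."""
    pairs = []
    for session in sessions:
        for i, target in enumerate(session):
            lo, hi = max(0, i - window), min(len(session), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((target, session[j]))
    return pairs

# hypothetical sessions of hotel IDs
sessions = [["h1", "h2", "h3"], ["h2", "h4"]]
print(skipgram_pairs(sessions))
# → [('h1', 'h2'), ('h1', 'h3'), ('h2', 'h1'), ('h2', 'h3'),
#    ('h3', 'h1'), ('h3', 'h2'), ('h2', 'h4'), ('h4', 'h2')]
```

These pairs then feed a standard embedding model; side information such as attributes or geography, as in the talk, would be concatenated or jointly embedded on top of this co-click signal.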

Deep Learning for Automated Audio Captioning

by Wenwu Wang | Signal Processing & ML Professor | University of Surrey

Biography:
Wenwu Wang is a Professor in Signal Processing and Machine Learning, and a Co-Director of the Machine Audition Lab within the Centre for Vision, Speech and Signal Processing. His current research interests include signal processing, machine learning and perception, with a focus on audio/speech and multimodal (e.g. audio-visual) data. He has (co-)authored over 250 papers in these areas. He is a (co-)recipient of over 15 awards, including the Best Paper Award at ICAUS 2021, the Judge's Award at DCASE 2020, the Reproducible System Award at DCASE 2019 and 2020, the Best Student Paper Award at LVA/ICA 2018, the Best Oral Presentation at FSDM 2016, Best Student Paper Award finalist at ICASSP 2019 and LVA/ICA 2010, the TVB Europe Award for Best Achievement in Sound in 2016, the Best Solution Award in the Dstl Challenge in 2012, 1st place in the 2020 DCASE challenge on "Urban Sound Tagging with Spatio-Temporal Context", and 1st place in the 2017 DCASE Challenge on "Large-scale Weakly Supervised Sound Event Detection for Smart Cars". He is a Senior Area Editor (2019-) for IEEE Transactions on Signal Processing, an Associate Editor (2020-) for IEEE/ACM Transactions on Audio, Speech and Language Processing, and an Associate Editor (2019-) for the EURASIP Journal on Audio, Speech and Music Processing. He is a Specialty Editor in Chief (2021-) of Frontiers in Signal Processing, and was an Associate Editor (2014-2018) for IEEE Transactions on Signal Processing. He is an elected Vice Chair (2022-) of the IEEE Machine Learning for Signal Processing Technical Committee, a Member (2021-) of the IEEE Signal Processing Theory and Methods Technical Committee, and a Member (2019-) of the International Steering Committee of Latent Variable Analysis and Signal Separation. He was a Publication Co-Chair for ICASSP 2019, Brighton, UK. He is a Satellite Workshop Co-Chair for INTERSPEECH 2022, Incheon, Korea.

Abstract:
Automated audio captioning (AAC) aims to describe an audio clip using natural language and is a cross-modal translation task at the intersection of audio processing and natural language processing. Generating a meaningful description for an audio clip not only requires determining what audio events are present, but also capturing and expressing their spatial-temporal relationships. Automated audio captioning is useful in applications such as assisting the hearing-impaired to understand environmental sounds, facilitating retrieval of multimedia content, and analyzing sounds for security surveillance. However, current audio captioning systems face several challenges, for example, a lack of labelled training data, a lack of diversity in the generated text descriptions, and a lack of effective cross-modal representation of audio and text. In this talk, we will present our ongoing progress in designing new learning techniques for improving the accuracy and diversity of audio captioning systems, for example, using contrastive learning to improve audio representation and audio-text alignment, using transformer-based methods to capture global information within an audio signal and temporal relationships between audio events, and using adversarial learning to generate diverse language descriptions. We show the competitive performance of our systems on datasets such as AudioCaps, Clotho and DCASE, using performance metrics such as BLEU, ROUGE-L, METEOR, CIDEr, SPICE and SPIDEr.

Effective ML system development

by Leonard Aukea | Driving Machine Learning Engineering & Operations | Volvo Cars

Show info

Biography:
Leonard is driving ML Engineering and Operations at Volvo Cars. He is responsible for defining the overall mission and strategy for ML Engineering and Operations, leading the build of reproducible ML systems. Leonard Aukea has spent most of his career as a Data Scientist/ML Engineer.

Abstract:
In order to efficiently deliver and maintain ML systems, the adoption of MLOps practices is a must. In recent times, the ML community has had to embrace and modify ideas originating from software engineering, with reasonable success. Software 2.0 (AI/ML) poses some additional challenges that we are still struggling with today: in addition to code, data and models must also abide by the continuous principles (Continuous Integration, Delivery and Training). At Volvo Cars, we are embracing a git-centric, declarative approach to ML experimentation and delivery. The adoption of MLOps principles requires cultural transformation alongside supportive infrastructure and tooling that enables efficient development throughout the ML lifecycle. Join us for this session to learn how Volvo Cars embraces MLOps.

The Consciousness of AI

by Smriti Mishra | Head of AI | Earthbanc

Show info

Biography:
Smriti Mishra is the Head of Artificial Intelligence at Earthbanc, where she implements remote sensing techniques for the verification of carbon sequestration. Her Bachelor's thesis was in the realm of Computational Neuroscience: a study of overlapping sequences in the brain, how thoughts are processed and why people lose memory. Smriti is a Founding Member of AI Guild, the go-to community for data and business professionals advancing AI adoption. She is also a "Google Women Techmakers" Ambassador in Sweden.

Abstract:
Have you ever been intrigued by the human mind: how human beings think and process memory, or why people sometimes lose memory? Ever wondered how artificial intelligence can be used to understand complex cognitive mechanisms? Or how AI and deep learning can be used to improve mental health? During this session, I will walk you through several different types of patterns inside the human brain using Bayesian Confidence Propagation Neural Networks and sequence learning in a non-spiking attractor neural network. I will also discuss the two primary types of overlaps that occur in the human brain and how they affect our memory and cognitive abilities. I will talk about how we can use recurrent neural networks and sequence learning to study memory loss in diseases like amnesia and dementia. This brain-inspired neural network model can also encode and reproduce temporal aspects of the input, and offers internal control of the recall dynamics by gain modulation. Furthermore, I will discuss how technology and artificial intelligence can be employed for mental wellness, and in removing the stigma around mental health.

Harmonic Analysis: A Complex Classification Problem

by Gianluca Micchi | Machine Learning Researcher | IRIS Audio Technologies

Show info

Biography:
Gianluca Micchi has a double education: a diploma in music (piano) and a PhD in theoretical physics (on nano-electromechanical systems), so he decided to become a computer scientist. As a machine learning researcher, he has worked both in academia (as a postdoc at the University of Lille) and in private companies (Skylads, TikTok, and now IRIS Audio Technologies). His main area of interest nowadays is AI applied to music and audio. He was part of the team that secured fourth place at the first edition of the AI Song Contest.

Abstract:
Automatic harmonic analysis has been an enduring focus of the Music Information Retrieval community and has enjoyed a particularly vigorous revival of interest in the machine-learning age. At heart, this is a classification problem with one target per time step, and it can therefore be solved with a Convolutional Recurrent Neural Network (CRNN) or a Transformer. However, it is complicated by the small amount of available training data: we have harmonic annotations for only a few hundred pieces, yet the naive approach of assigning one output class to each possible chord leads to ~10 million classes due to combinatorial explosion. Data augmentation is typically used but is not enough. One can also reduce the number of output classes by several orders of magnitude by predicting each component of the label (e.g. key or chord quality) independently; however, this has been shown to lead to incoherent output labels. To solve this issue, we use a modified Neural Autoregressive Distribution Estimation (NADE) as the last layer of a CRNN. The NADE layer ensures that labels related to the same chord are predicted jointly rather than independently, thereby enforcing coherence.
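The class-count arithmetic above can be sketched with made-up sub-label cardinalities (assumptions for illustration, not the talk's actual taxonomy):

```python
# Hypothetical decomposition of a chord label into sub-labels; the counts
# are invented for illustration, not taken from the talk's dataset.
sub_labels = {
    "key": 24,        # 12 tonics x major/minor
    "degree": 21,     # scale degrees including alterations
    "quality": 12,    # maj, min, dim, aug, seventh chords, ...
    "inversion": 4,
    "root": 35,
}

# Naive joint encoding: one output class per combination.
joint_classes = 1
for n in sub_labels.values():
    joint_classes *= n

# Factored encoding: one softmax head per sub-label.
factored_outputs = sum(sub_labels.values())

print(f"joint classes:    {joint_classes:,}")   # hundreds of thousands
print(f"factored outputs: {factored_outputs}")  # under a hundred
```

Even with these modest cardinalities the joint encoding explodes into six figures, while the factored encoding stays tiny; the cost, as the abstract notes, is that independent heads can emit mutually incoherent sub-labels, which is what the NADE layer repairs.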

Doctor-in-the-loop: Interactive Machine Learning in Healthcare AI

by Rachel Wities | NLP Researcher | Zebra Medical Vision

Show info

Biography:
Rachel Wities is an NLP Researcher, focused on the Healthcare domain. In her previous roles she was an NLP Data Scientist in Zebra Medical and a Research Scientist in PayPal. Rachel is a public speaker addressing Healthcare NLP challenges, and believes that understanding doctors and their needs is the key to successfully implementing AI Healthcare algorithms. Rachel holds an M.Sc. from BIU NLP lab, researching knowledge graph representation of text semantics, and a B.Sc in Physics and Cognitive Science from Hebrew University in Jerusalem. Loves her family, God and Oxford Comma jokes.

Abstract:
Working in a healthcare startup, one of my most frustrating experiences was asking doctors to do the tedious work of data annotation or result verification. Surely there's a better way, I told myself, to exploit the knowledge and expertise of doctors than to turn them into a labeling conveyor belt! Well, it turns out there is. Human-in-the-loop ML refers to human-machine interaction in data annotation and model training. At Zebra Medical we used human-in-the-loop techniques to compensate for a lack of tagged data and to better exploit clinical expert knowledge. In this lecture I will show how to make data annotation quicker and smarter by turning it into an interactive process, and how an interactive process of experts and models writing rules together can improve your model's performance without additional training. This talk is intended for AI researchers interested in better ways to exploit the knowledge and experience of domain experts, and for people interested in the challenges of AI in the healthcare domain.

Causal discovery in Python

by Aleksander Molak | ML Engineer | ML Researcher | Ironscales & Tensorcell

Show info

Biography:
Aleksander Molak is a Machine Learning Engineer and Researcher at Ironscales and a Machine Learning Researcher at Tensorcell. He's the author of #SundayAiPapers, a weekly LinkedIn microblog presenting the most recent papers on natural language processing, causal inference and probabilistic modeling. Aleksander loves traveling with his wife and is passionate about vegan food, languages and running.

Abstract:
Over the last decade, causal inference has gained a lot of traction in academia and in industry. Causal models can be immensely helpful in various areas, from marketing to medicine and from finance to cybersecurity. To make these models work, we need not only data, as in traditional machine learning, but also a causal structure. The traditional way to obtain the latter is through well-designed experiments. Unfortunately, experiments can be tricky: difficult to design, expensive or unethical. Causal discovery (also known as structure learning) is an umbrella term for several families of methods that aim to discover causal structure from observational (non-experimental) data. During the talk, we will review the basics of causal inference and introduce the concept of causal discovery. Next, we will discuss the differences between various approaches to causal discovery. Finally, we will see a series of practical examples of causal discovery using Python.
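As a hedged sketch of the core idea behind constraint-based discovery (not the specific library the talk uses), the example below builds a toy linear model X -> Y -> Z and shows the conditional-independence test that lets an algorithm such as PC delete the spurious X-Z edge:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# A toy linear structural causal model: X -> Y -> Z (assumed for the demo).
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.5 * y + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after regressing out one conditioning variable c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)  # residual of a ~ c
    rb = b - np.polyval(np.polyfit(c, b, 1), c)  # residual of b ~ c
    return np.corrcoef(ra, rb)[0, 1]

# Marginally, X and Z are strongly correlated...
print(abs(np.corrcoef(x, z)[0, 1]))   # large (~0.86 here)
# ...but conditionally independent given Y, so a constraint-based
# method would remove the direct X-Z edge from the graph.
print(abs(partial_corr(x, z, y)))     # near zero
```

Real causal discovery libraries wrap exactly this kind of test in a full search over variable subsets; the sketch only shows the single test that drives edge removal.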

Probabilistic programming, why we need it in business settings

by Luciano Paz | Principal Data Scientist | PyMC-Labs

Show info

Biography:
Luciano Paz is a Principal Data Scientist at PyMC-Labs. He studied physics and then transitioned into neuroscience, where his research involved computational techniques such as reinforcement learning, planning, stochastic dynamics and probabilistic programming. He is a core developer of PyMC, a probabilistic programming language in Python. His work at PyMC-Labs is to help companies power their business using Bayesian statistics. The projects he has worked on range from marketing studies, such as A/B tests and Media Mix Models, to complex behavioral models.

Abstract:
All businesses have to make decisions based on the noisy data they have available. They might have to choose which advertising campaign to use, how to distribute their marketing budget, or whether their product is effective in the market. It's often the case that the available data is scarce, structured and highly unbalanced, and it takes expert knowledge to piece together the little bits of information to be able to make decent decisions. I'll talk about why probabilistic programming is a good way to address these kinds of business scenarios, and how we at PyMC Labs use probabilistic programming languages (PPLs) to help companies make better decisions from their data.
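A minimal sketch of the kind of question a PPL answers, here worked by hand with the conjugate Beta-Binomial model rather than PyMC itself (the click counts are made up for illustration):

```python
import numpy as np

# Made-up A/B test data: clicks out of views for two ad variants.
clicks_a, views_a = 120, 1000   # variant A
clicks_b, views_b = 150, 1000   # variant B

rng = np.random.default_rng(42)
draws = 100_000

# With a flat Beta(1, 1) prior, the posterior over each conversion
# rate is Beta(clicks + 1, misses + 1); sample both posteriors.
post_a = rng.beta(clicks_a + 1, views_a - clicks_a + 1, size=draws)
post_b = rng.beta(clicks_b + 1, views_b - clicks_b + 1, size=draws)

# The decision-ready quantity: probability that B converts better than A.
p_b_better = (post_b > post_a).mean()
print(f"P(B > A) = {p_b_better:.3f}")
```

A PPL generalizes this pattern to models with no closed-form posterior, which is where hierarchical Media Mix Models and behavioral models live.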

How can we learn the structure of customer service data @Allegro?

by Aleksandra Chrabrowa | NLP/ML Research Engineer | Allegro

Show info

Biography:
Ola Chrabrowa is a Machine Learning Research Engineer @Allegro. She works with NLP (textual data). She has a couple of years of experience in the ML/NLP field and a background in physics.

Abstract:
@Allegro, we automate customer service. For good automation with ML algorithms, one needs to know the structure of the data: the precise taxonomy of user questions to customer service (so-called intents). We perform open-world intent discovery with multiple data sources and leverage the pretraining of in-domain text encoders to our advantage. We review popular training schemes for clustering and find that their performance on real business data is sometimes exactly the opposite of that on public benchmark datasets.

Neural radiance fields and their applications

by Marek Kowalski | Scientist | Microsoft Mixed Reality & AI Lab

Show info

Biography:
Marek Kowalski is a scientist at the Microsoft Mixed Reality & AI Lab. In the last few years, he has been working mostly on applying neural rendering to the problem of telepresence. His other experience in computer vision and machine learning includes work on facial landmark localization, face recognition, camera pose estimation and 3D reconstruction. His PhD dissertation, completed at the Warsaw University of Technology, was selected as the best PhD dissertation of 2019 by the Polish Artificial Intelligence Society. In his free time Marek enjoys flying gliders and playing tennis.

Abstract:
The last few years have seen an explosion of research on rendering images with the aid of machine learning. One of the most exciting recent inventions in this field is Neural Radiance Fields (NeRFs), which allow for synthesizing novel views of a scene with great quality and multi-view consistency. These ML models are based on multi-layer perceptrons and use very simple losses, which sets them apart from much of the previous work, which was based on convolutional neural networks and used complex adversarial loss functions. Since the initial publication two years ago, NeRFs have been applied to a great variety of problems including facial animation, 3D selfies, full-body human rendering and camera pose estimation. In this presentation I will talk about the benefits and downsides of neural radiance fields and describe some improvements that have been proposed. I will also discuss their applications and describe how they can be made practical for problems like telepresence.
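One ingredient that lets those simple MLPs represent fine detail is the frequency (positional) encoding of input coordinates from the original NeRF paper; a minimal sketch:

```python
import numpy as np

def positional_encoding(x, num_freqs=10):
    """NeRF-style encoding: map each coordinate to sines and cosines
    at octave-spaced frequencies, so the MLP can fit high-frequency detail."""
    freqs = 2.0 ** np.arange(num_freqs) * np.pi   # pi, 2*pi, 4*pi, ...
    scaled = x[..., None] * freqs                 # (..., dims, num_freqs)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)

pt = np.array([[0.25, -0.5, 0.1]])               # one 3D point
print(positional_encoding(pt).shape)             # (1, 60): 3 dims x 10 freqs x (sin, cos)
```

In the full model this encoded vector, together with an encoded view direction, is fed to the MLP that outputs density and color per sample along each camera ray.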

Algorithmic Balancing Models for Multi-stakeholder Recommendations

by Rishabh Mehrotra | Director - Machine Learning | ShareChat

Show info

Biography:
Rishabh Mehrotra is a Director of Machine Learning and Head of Marketplace at ShareChat in London. He obtained his PhD in Machine Learning and Information Retrieval from University College London, where he was partially supported by a Google Research Award. His PhD research focused on Bayesian inference of search tasks from search conversational interaction logs. His current research focuses on machine learning for marketplaces, multi-objective modeling of recommenders and the creator economy. Some of his recent work has been published at conferences including KDD, WWW, SIGIR, NAACL, RecSys and WSDM. He has co-taught a number of tutorials at leading conferences including KDD, RecSys, WWW and CIKM, and has taught courses at summer schools.

Abstract:
Recommender systems shape the bulk of consumption on digital platforms, and are increasingly expected not only to support consumer needs but also to benefit content creators and suppliers by helping them get exposed to consumers and grow their audience. Indeed, most modern digital platforms are multi-stakeholder platforms (e.g. Airbnb: guests and hosts; YouTube: consumers and producers; Uber: riders and drivers; Amazon: buyers and sellers), and rely on recommender systems to strive for a healthy balance between user, creator and platform objectives to ensure the long-term health and sustainability of the platform. In this talk, we discuss a few recent advancements in multi-objective modeling spanning fuzzy aggregations, set transformers and reinforcement learning. While the main focus is on multi-objective balancing, the talk also touches upon the related problems of trade-off handling and user/content/creator understanding to support multi-stakeholder platform ecosystems. The talk ends by discussing learnings from the development and deployment of balancing approaches across 350+ million users on large-scale recommendation platforms.

AI for creative applications & art

by Ivona Tautkute | Tech Lead and Senior AI Engineer | Tooploox

Show info

Biography:
Ivona is an Artificial Intelligence and Machine Learning Engineer and Researcher with a focus on Computer Vision. She holds a Master's and a Bachelor's degree in Mathematics from the University of Warsaw and is currently pursuing a Ph.D. in Computer Science at the Polish-Japanese Academy of Information Technology. The topic of her doctoral thesis is "Artificial neural networks for multimodal data embeddings and classification". Currently, Ivona works as a Tech Lead and Senior AI Engineer at the software company Tooploox. Her areas of expertise include image recognition, object detection, object segmentation, 3D data analysis, time series prediction, and generative adversarial networks. She has published her research at major AI/ML conferences and in journals such as CVPR, IEEE Access and ICONIP, and has presented her work at technology conferences in Los Angeles, New York, Lausanne, Prague, Salt Lake City and more. Ivona is also a prominent AI artist, working with photography and GANs. Her artworks are exhibited at galleries worldwide and were recently sold at the prestigious auction house Sotheby's in New York.

Abstract:
Progress in generative AI methods has enabled their application in the creative industries. In this talk, I will briefly overview the technology and tools behind AI-generated content, and present my own applications of GANs to creative photography manipulation and animation: a project on creating animations from my photography collections using GANs. Furthermore, the presentation will contain examples of AI in media content creation, as well as an introduction to methods that can be used in that domain, e.g. GANs, VQGAN, CLIP and DALL-E.

Methods for efficient management and deployment of complex deep learning systems: a SportsTech use case

by Wojciech Rosinski | CTO | ReSpo.Vision

Show info

Biography:
Wojciech Rosinski is the CTO of ReSpo.Vision, a SportsTech startup aiming to revolutionize football analytics by leveraging the latest cutting-edge AI research to watch and analyze football games, providing players, scouts, managers, clubs and federations with unmatched depth of knowledge. He has extensive experience in both R&D and industry, where he has worked on diverse projects spanning computer vision, natural language processing and tabular data. He is a Kaggle Master with 2 gold medals and multiple high finishes.

Abstract:
Running complex machine learning workflows at scale requires a set of specialized tools covering different stages of the process. Flexible pipeline parametrization, robust experiment tracking and job scheduling are among the key ingredients that keep the workflow easily manageable. I will describe the architecture and workflow we use for managing our system, which enables 3D data extraction from single-camera sports videos. The system combines multiple deep learning and machine learning models with complex interdependencies. I will talk about the tools we have chosen and developed to manage the pipelines, optimize the models' performance and deploy the system efficiently. In addition, sports data analytics use cases based on 3D data, with a focus on football, will be presented.

Challenges in developing Visual-Search system at Allegro

by Bartosz Ludwiczuk & Bartosz Paszko | Research Engineers | Allegro

Show info

Biography:
Bartosz Paszko is a Research Scientist in the Allegro Machine Learning Research team and a graduate of the Warsaw University of Technology. During his career he has worked on a variety of projects in the field of computer vision. Privately, a fan of generative models, books and climbing. Bartosz Ludwiczuk also works at Allegro on the MLR team. His main focus is representation learning techniques, mainly in the image domain. He has successfully developed and deployed systems for face recognition, image retrieval and gait recognition. In his free time, he goes on bicycle trips with his child in a trailer.

Abstract:
As 'a picture is worth a thousand words', image-search engines are one of the most promising e-commerce techniques for helping users find the right product in an immense marketplace. This is why at Allegro we are constantly developing such a feature, called Visual Search. While transforming the POC into a fully functional feature, we tackled many topics related to machine learning and deployment strategy. In this talk, we will go through each step of this transformation: starting from synthetic dataset creation to mock up user behavior, through merging visual recommendations from many departments, to a proper validation schema for the whole multi-model pipeline. As our journey had many ups and downs, we will close with the lessons learned from developing the image search engine.

Accelerating AI processing on the edge

by Karol Gugala | Engineering Manager | Antmicro

Show info

Biography:
Karol Gugala is an Engineering Manager at Antmicro, where he works with open source in various contexts, primarily FPGAs, embedded software and AI. An open-source enthusiast, he is involved in a wide variety of FOSS projects.

Abstract:
The demand for deploying machine learning models, especially state-of-the-art deep neural networks, on edge devices is rapidly growing. Edge AI allows inference to run locally, without a connection to the cloud, which makes the technology more portable and self-sufficient. Because no data needs to be sent to the cloud, edge AI solutions are also much safer in terms of data privacy. In the presentation we will introduce Kenning, an open source framework from Antmicro that mitigates the compatibility issues between edge devices from various vendors. The presentation will also discuss the problems of deploying AI algorithms on resource-constrained devices. Efficient AI inference requires hardware acceleration, which is very often vendor-specific and often leads to vendor lock-in. With Kenning it's no longer an issue.

Deep Neural Deduplication

by Marcin Mosiolek | AI Architect | SII Poland

Show info

Biography:
Marcin is an AI Architect with over ten years of experience in a wide range of commercial machine learning projects, mainly related to natural language processing and computer vision. He converts the latest academic research into operating products in his daily job rather than Jupyter Notebooks only. After working hours, Marcin enjoys strong winds and rough seas while kitesurfing.

Abstract:
Effective identification of duplicated content is inherent in processing large amounts of text documents, such as web pages, articles or contracts. Recently, our team developed a solution that exploits contrastive learning techniques to find duplicated content among thousands of construction documents in the blink of an eye. The talk shares our experience and provides an introduction to contrastive learning.
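A hedged sketch of the retrieval step (not the team's actual system): once a contrastively trained encoder has produced document embeddings, duplicates are pairs whose cosine similarity exceeds a threshold. The toy 2D vectors below stand in for real embeddings:

```python
import numpy as np

def find_duplicates(embeddings, threshold=0.9):
    """Flag document pairs whose embedding cosine similarity exceeds a threshold.
    In a real system the embeddings would come from a contrastively trained encoder."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T                       # all-pairs cosine similarity
    i, j = np.triu_indices(len(embeddings), k=1)   # each unordered pair once
    return [(a, b, sims[a, b]) for a, b in zip(i, j) if sims[a, b] > threshold]

# Toy 2D embeddings: documents 0 and 1 are nearly identical, document 2 is unrelated.
emb = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]])
print(find_duplicates(emb))   # a single flagged pair: (0, 1)
```

The contrastive objective is what makes the threshold meaningful: it pulls embeddings of duplicated documents together and pushes unrelated ones apart, and at scale the brute-force all-pairs matrix would be replaced by an approximate nearest-neighbor index.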

A brief history of Neural TTS

by Andrew Breen | Senior Manager | Amazon TTS Research

Show info

Biography:
Andrew has a B.Sc. Hons in Physics with Computing Physics from University College Swansea, an M.Sc. (Eng.) by research from Liverpool University, and a Ph.D. in Speech Science from University College London. Andrew worked at BT Labs on ASR, and led teams on TTS, avatars and multi-modal distributed systems. In 1999 he joined the University of East Anglia as a Senior Lecturer. He joined Nuance in 2001 as the founder of their TTS organisation, eventually becoming Director of TTS Research and Product Development in India and China. In 2017 he joined Amazon as the Sr. Manager for research in Amazon's TTS organization.

Abstract:
For many years, "concatenative" speech synthesis was the industry standard for text-to-speech technology. It provided relatively high-quality (and in limited domains, very high-quality) synthetic audio sufficient for widespread commercial use. However, it had its limits. It required large amounts of pre-recorded audio spoken by professional voice talents, and recordings were constrained to a narrow range of expressivity. These constraints meant that many applications which required expressivity or many voices could not be adequately supported. The invention of neural approaches to speech synthesis in 2016 appeared to offer a way forward. By 2018, Amazon scientists had demonstrated that, using a generative neural network approach, they could produce natural-sounding expressive speech (e.g. a newscaster style) using only a fraction of the data previously needed to generate "neutral" speech. This advance paved the way for Alexa and other Amazon services to adopt different speaking styles in different contexts, improving customer experiences. This talk will place speech synthesis in a historical perspective, review current neural approaches, and summarise future challenges and opportunities.

An introduction to quantum machine learning

by Paweł Gora | CEO | Quantum AI Foundation

Show info

Biography:
Paweł Gora is a scientist, IT specialist and entrepreneur working mostly on applications of AI and quantum computing (especially in transportation and medicine). He is the founder and CEO of the Quantum AI Foundation (http://www.qaif.org), a charity organization aiming to support education, research, development and collaboration in science and new technologies, especially AI and quantum computing. He also moderates the "Quantum AI" group (https://www.facebook.com/groups/quantumai), is one of the organizers of the Warsaw.ai and Warsaw Quantum Computing Group meetups, and is a Board member of QWorld (https://qworld.net) and its local division, QPoland.

Abstract:
In this talk, I will give an introduction to quantum machine learning. First, I will explain what quantum computing is and how it differs from the classical computing paradigm. Later, I will describe how machine learning could be enhanced using quantum computing, presenting some of the possible opportunities and existing challenges. Finally, I will present how machine learning can be applied in the quantum computing domain, and outline how one can continue their education and start a professional career in quantum machine learning.
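A minimal state-vector sketch of the qubit arithmetic quantum ML builds on (pure NumPy, not a quantum SDK):

```python
import numpy as np

ket0 = np.array([1.0, 0.0])                      # the |0> state
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # Hadamard gate

state = H @ ket0                                 # equal superposition (|0>+|1>)/sqrt(2)
probs = np.abs(state) ** 2                       # Born rule: measurement probabilities
print(probs)                                     # [0.5 0.5]

# A parameterized gate, here a rotation by angle theta, is the trainable
# knob in variational quantum ML circuits, much like a neural-network weight.
theta = np.pi / 3
Ry = np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
               [np.sin(theta / 2),  np.cos(theta / 2)]])
print(np.abs(Ry @ ket0) ** 2)                    # [0.75 0.25]
```

Variational quantum ML, one of the approaches such an introduction typically covers, optimizes angles like `theta` with a classical optimizer so that the circuit's measurement statistics solve a learning task.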

Semantic information extraction

by Mateusz Półtorak | Senior Data Scientist | Pearson

Show info

Biography:
Mateusz's great passion is the modeling and mathematical approximation of the complex mechanisms that rule our world. For over two years now, his work has focused mainly on natural language processing. Currently, he focuses on merging two worlds: NLP and real-world use cases, such as the automatic assessment of open-ended English language tests. Privately, he worships all kinds of art.

Abstract:
Semantic information extraction is a hot topic (as is NLP as a whole). It is applied in a variety of products, such as automated assistants, chatbots, and intelligent tutoring systems. This talk will focus on natural language processing techniques designed to extract semantic information from text input. During the talk we will spot the differences between paraphrase detection and natural language inference, which will allow us to define which use cases can benefit from each method. The talk will provide examples from the field of English language learning, where applications often need to infer whether learners' unstructured responses match a model answer. We will touch on topics such as deep language models, sentence embeddings, and zero-shot learning.

How AI boosted Audio Processing?

by Adam Kupryjanow | AI Applied Research Scientist | Intel

Show info

Biography:
Adam is an AI Applied Research Scientist in the Intel Audio team in Gdansk, Poland. He received a Ph.D. from Gdansk University of Technology (GUT) in 2013, where he worked on methods for speech intelligibility improvement. At Intel he has been involved in many projects devoted to speech and voice processing. He has developed algorithms and IP blocks such as beamforming, reverb reduction, noise reduction and automatic gain control. These solutions are used in client products (laptops) and IoT devices (smart speakers, smart fridges, smart microwaves, kitchen robots).

Abstract:
Historically, all audio signal processing was done using traditional digital signal processing (DSP) techniques. The two main issues with neural networks were compute complexity and the quality of the processed audio. In 2016, a neural network called WaveNet was designed. It was the first network that allowed high-quality noise reduction, but its compute complexity was too high: it was impossible to run in real time on a regular PC. Then in 2017, research moved one step further with the U-Net architecture. U-Net was taken from image processing and adapted for audio processing. It gave better quality than WaveNet, but it was still not possible to perform inference in real time on a regular laptop. In 2019, the idea of using convolutional neural networks to perform noise reduction was proposed. This was the beginning of real AI audio processing. During this presentation I will demonstrate how neural networks (NNs) are used to enhance speech and voice signals.

Training (vision) models on geodata for the protection of water reservoir ecosystems

by Paulina Knut & Zuzanna Szafranowska-Skorupko | Senior Data Scientist, Senior Software Engineer | deepsense.ai

Show info

Biography:
Paulina is a Senior Data Scientist at deepsense.ai. She is mainly interested in data visualization, in particular the visualization of model results. She graduated from the University of Warsaw, Faculty of Mathematics, Informatics and Mechanics. During her work at deepsense.ai, Paulina has been involved in many commercial and R&D projects. Zuzanna is a Senior Software Engineer in ML at deepsense.ai, focusing on the design and creation of computer-vision-based solutions. Apart from her work at deepsense.ai, she's also a part-time freelance researcher, collaborating with the University of Barcelona on projects related to GANs in medical imaging. She has studied and/or worked in Warsaw, Berlin and Barcelona, holds an MSc in Computer Vision, and has 5+ years of professional experience in DL and SE. Her greatest satisfaction comes from creative teamwork in discovering new applications of under-explored ideas.

Abstract:
We'll share with you our recent experience of working with geodata in the context of a project developed for the World Wide Fund for Nature (WWF). The challenge of this project is designing a model for the classification of water reservoirs. But our task is not a standard one: for input, we're only using the object's shape and location, as well as its position in relation to other water reservoirs. What's the best way to take advantage of this information? During our presentation we'll describe geodata and how best to handle it, and we'll discuss potential solutions to the problem. Then, we'll reveal which methods were the most successful in this task and what results we have been able to achieve so far.

Ligands classification using sparse convolutional neural networks

by Jacek Karolczak | Student | Poznan University of Technology

Show info

Biography:
Jacek Karolczak is an undergraduate student at the Poznań University of Technology and a member of the Group of Horribly Optimistic STatisticians (GHOST). Currently working as a Junior Machine Learning Engineer at AI REV.

Abstract:
Structure-guided drug design is a field of study that aims to create novel drugs based on potential molecular interactions with proteins in our bodies. This approach to drug design requires simulations or structural biology experiments that determine which small molecules (ligands) bind to therapeutic targets. One of the steps in validating a properly bound molecule is ligand labelling. This process requires expert knowledge and can be time-consuming. In the past few years, classical machine learning approaches have succeeded in partially automating this task through ligand classification based on the 3D shape of their electron density. Meanwhile, rapid advancement has been observed in place recognition problems, where the task is to classify 3D terrains. At the root of this advancement is the development of models based on convolutional neural networks for spatially sparse data. In the presentation, I will talk about our attempts to use state-of-the-art place recognition models for ligand classification. I will elaborate on the similarities between place and ligand recognition, and discuss the benefits of using sparse CNNs compared to currently used methods. Finally, I will present the TransLoc3D architecture and the experimental results obtained using this model.

FairPAN - Achieving fairness through neural networks

by Hubert Ruczyński | Student | Warsaw University of Technology

Show info

Biography:
I'm in the 6th semester of Data Science studies at the Faculty of Mathematics and Information Science of the Warsaw University of Technology (MiNI PW). Last summer I was part of the FairPAN project during the MI2 DataLab internship, and this talk is a result of that work. The internship was my first scientific and work experience, and I'm really grateful for the opportunity.

Abstract:
With the ongoing development of XAI methods, we data scientists have to take more and more responsibility for the behaviour of our models. It means that we should no longer care only about what the results are and how accurate they are, but also about how they are obtained. One way to take care of this is to measure the fairness of our models and fight the bias present inside them. Most known approaches to bias mitigation, however, are very costly in terms of performance reduction. The Fair Predictive Adversarial Network (FairPAN), however, is a solution that loses less performance than many methods out there. It is based on the idea of GANs (Generative Adversarial Networks), where we engage two neural networks in the training process. The first one (the classifier) performs a classification task, whereas the second one (the adversary) has to predict the sensitive class of an observation based only on the outcome of the classifier. During this zero-sum game, the classifier is penalized every time the adversary predicts the sensitive value correctly, so it starts to learn how to fool the second model. This continues until the adversary can only guess the sensitive value with a 50% chance of success, which means the adversary can no longer predict the sensitive value, thus making the classifier's outputs fair. The result of this research is an R package called `fairpan`, which implements all the methods and enables its users to apply them easily.
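A hedged sketch of the zero-sum objective described above, with made-up numbers standing in for model outputs (this illustrates the loss structure only, not the `fairpan` package API):

```python
import numpy as np

rng = np.random.default_rng(1)

def bce(p, t):
    """Binary cross-entropy between predicted probabilities p and targets t."""
    return -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))

# Toy batch (random stand-ins for illustration): classifier output
# probabilities, task labels, a sensitive attribute, and the adversary's
# guess of that attribute computed from the classifier's output alone.
y_hat = rng.uniform(0.05, 0.95, size=64)   # classifier predictions
y = rng.integers(0, 2, size=64)            # task labels
s = rng.integers(0, 2, size=64)            # sensitive attribute
s_hat = rng.uniform(0.05, 0.95, size=64)   # adversary's guess from y_hat

lam = 1.0
adversary_loss = bce(s_hat, s)                           # adversary minimizes this
classifier_loss = bce(y_hat, y) - lam * adversary_loss   # classifier minimizes this

# The minus sign encodes the zero-sum game: a successful adversary (low
# adversary_loss) raises the classifier's loss, pushing the classifier
# toward predictions that carry no information about the sensitive attribute.
print(classifier_loss, adversary_loss)
```

In an actual training loop the two losses are minimized in alternation by the two networks; when the adversary is reduced to 50% accuracy, the penalty term stops providing gradient and the classifier's outputs are, by this criterion, fair.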

Is music a natural language?

by Sebastian Chwilczyński | Student | Poznan University of Technology

Show info

Biography:
Sebastian Chwilczyński is a 2nd-year student of Artificial Intelligence at Poznań University of Technology, a music enthusiast and a GHOST member. He spent the last six months at the Poznan Supercomputing and Networking Center solving computer vision problems. Currently he is trying to merge his two passions by exploring the world of machine learning for audio signal processing and doing projects in this area. His favorite part of learning a new method is understanding all the math behind it.

Abstract:
From the AI perspective, one of the most interesting features of natural languages is the possibility of predicting the subsequent word given some context. Our chances of guessing the next word are known to improve the more context we have, which can be shown by measuring conditional entropy. In my presentation, we will explore whether we can perform the same kind of analysis for music. If yes, does music behave like other natural languages? Are some genres (languages) more complex than others? Can we successfully process music using machine learning techniques made for languages? One approach is to transform music in the MIDI format to plain text using ASCII coding and then perform the analysis. Moreover, we can fit an LSTM model using that "musical" text. Unfortunately, many problems arise. Music programs don't understand such text, so transforming back to MIDI is necessary. We don't know how many units of time n generated characters will span. MIDI lacks the "human" feel. Finally, creating datasets may be problematic given the absence of some songs in MIDI. More sophisticated methods are needed: GANs and Wave2Midi2Wave to the rescue!
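The conditional-entropy measurement described above can be sketched directly on symbol sequences; the toy "melody" string below is a stand-in for ASCII-encoded MIDI:

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(text, context_len):
    """H(next symbol | previous context_len symbols), estimated in bits."""
    contexts = defaultdict(Counter)
    for i in range(len(text) - context_len):
        contexts[text[i:i + context_len]][text[i + context_len]] += 1
    total = sum(sum(c.values()) for c in contexts.values())
    h = 0.0
    for ctx_counts in contexts.values():
        n = sum(ctx_counts.values())
        for count in ctx_counts.values():
            h -= (count / total) * math.log2(count / n)
    return h

# A repetitive toy melody: the more context, the more predictable the next note.
melody = "CDEFGFEDCDEFGFEDCDEFGFED"
for k in range(3):
    print(k, round(conditional_entropy(melody, k), 3))
# Entropy drops as context length grows; comparing how fast it drops across
# corpora is exactly the kind of analysis the talk applies to music vs. language.
```

On this fully periodic toy string, two symbols of context already make the next note deterministic (entropy 0); real music and real language sit somewhere in between, which is what makes the comparison interesting.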