Showing posts from April 2019

[PyData] Kannappan Sirchabesan: Slack slash app using Google Cloud Functions

PyData London Meetup #54 Tuesday, March 5, 2019 Google Cloud Functions is a lightweight, serverless way to create single-purpose, stand-alone functions that respond to events. It can be used to build event-driven microservices. This lightning talk will demonstrate the ease of developing Cloud Functions in Python by building a Slack slash app that uses the Google Knowledge Graph Search API to retrieve quick knowledge snippets about real-world entities, such as people and places, from within Slack. Sponsored & Hosted by Man AHL

www.pydata.org PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities appr

[PyData] Data Engineering Principles - Build frameworks not pipelines - Gatis Seja

PyData London Meetup #54 Tuesday, March 5, 2019 Data pipelines are necessary for the flow of information from its source to its consumers, typically data scientists, analysts and software developers. Managing data flow from many sources is a complex task, where the maintenance cost limits the scale at which a large, reliable data warehouse can be built. This presentation proposes a number of applied data engineering principles that can be used to build robust, easily manageable data pipelines and data products. Examples will be shown using Python on AWS. Sponsored & Hosted by Man AHL

[PyData] Visualizing & Analyzing Earth Science Data Using PyViz & PyData

PyData Ann Arbor Meetup - April 10, 2019 Sponsored by NumFOCUS and TD Ameritrade https://www.meetup.com/PyData-Ann-Arbor/ Earth Science presents interesting issues of large, multi-dimensional datasets stored in a variety of idiosyncratic file formats. In this talk, we'll work through some specific workflows and explore how various tools - such as Intake, Dask, Xarray, and Datashader - can be used to effectively analyze and visualize these data. Working from within the notebook, we'll iteratively build a product that is interactive, scalable, and deployable. ----- Julia Signell is a software developer at Anaconda Inc. currently working on developing best practices for Python-using earth scientists. She works on visualization tools within the PyViz ecosystem and data ingestion and analysis tools in the broader PyData world. She lives in Philadelphia and previously did hydrology research at Princeton - studying lightning and rain patterns, water movement through the land

[PyData] Deep Learning for Named Entity Recognition - Kfir Bar

PyData Tel Aviv Meetup #22 3 April 2019 Sponsored and Hosted by SimilarWeb https://www.meetup.com/PyData-Tel-Aviv/ Named Entity Recognition is one of the key tasks in commercial Natural Language Processing applications. Its objective is to identify named entity mentions, such as people, organizations, and locations, in running text. State-of-the-art approaches are purely data-driven, leveraging deep neural networks. In this talk, I will present a few of those works, followed by a description of our own deep NER implementation, based on TensorFlow. We'll look at accuracy, speed, and memory footprint, while comparing some of the best known deep architectures with a basic statistical approach.

[PyData] Learning Large Scale Models for Content Recommendation - Sonya Liberman

PyData Tel Aviv Meetup #22 3 April 2019 Sponsored and Hosted by SimilarWeb https://www.meetup.com/PyData-Tel-Aviv/ Serving tens of billions of personalized recommendations a day under a latency of 30 milliseconds is a challenge. In this talk I'll share our algorithmic architecture, including its Spark-based offline layer, and its Elasticsearch-based serving layer, that enable running complex models under difficult scale constraints and shorten the cycle between research and production.

[PyData] Learning with Label Noise - Yaniv Katz

PyData Tel Aviv Meetup #22 3 April 2019 Sponsored and Hosted by SimilarWeb https://www.meetup.com/PyData-Tel-Aviv/ Labeled data containing incorrect labels, termed label noise, has gained much attention in machine learning research due to its adverse impact on supervised models. This effort has increased in recent years, as the usage of larger data sets, which are more prone to label noise, has become prevalent. To tackle this problem, studies have explored the sensitivity of the learning process to label noise and devised robust methodologies to overcome it. This talk covers basic concepts in label noise research and explores suggested approaches for overcoming its negative effects. It also showcases two practical examples of easy-to-use methods which were tested on training sets contaminated by label noise and by target value noise.

[PyData] Unit Testing Data with Marbles - Jane Stewart Adams, Leif Walsh

PyData NYC 2018 In the same way that we need to make assertions about how code functions, we need to make assertions about data, and unit testing is a promising framework. In this talk, we'll explore what is unique about unit testing data, and see how Two Sigma's open source library Marbles addresses these unique challenges in several real-world scenarios. Slides - https://www.slideshare.net/PyData/uni...

[PyData] Train, Evaluate, Repeat: Building a Credit Card Fraud Detection System - Leela Senthil Nathan

PyData NYC 2018 This talk covers three major ML problems Stripe faced (and solved!) in building its credit card fraud detection system: choosing labels for fraud that work across all merchants, addressing class imbalance (legitimate charges greatly outnumber fraudulent ones), and performing counterfactual evaluation (to measure performance and obtain training data when the ML system is changing outcomes itself).

[PyData] Privacy Isn't Dead

PyData Ann Arbor Meetup - February 6, 2019 Sponsored by NumFOCUS, TD Ameritrade, and MIDAS https://www.meetup.com/PyData-Ann-Arbor/ Helen Odom | Privacy Isn't Dead Every active and budding data scientist needs to be informed of modern privacy and data handling laws and regulations. Join us for a whirlwind tour of hot topics that will cover IoT, data breaches, cybersecurity, data ethics, and data ownership. This talk will also provide a general overview and an educational look back at how we got here. ----- Helen Odom, CIPP/US, currently serves as Senior Legal Counsel at TD Ameritrade, where she helps the business work through complex legal issues at the intersection of technology, media, brand, and innovation.

[PyData] Protein Design by Multi-Objective Optimisation - Eyal Kazin

PyData London Meetup #53 Tuesday, February 5, 2019 In this talk I will discuss Pareto Optimisation, a method for finding optimal solutions of multiple objective functions, and demonstrate an application in protein design. Sponsored & Hosted by Man AHL

[PyData] AI over Auscultation : Improving the Early Diagnostic - Kevin Machado

PyData London Meetup #53 Tuesday, February 5, 2019 This project is about how we are contributing to the early diagnosis of cardiovascular disease by improving a basic medical tool through the use of signal processing techniques, user interface design and machine learning. Sponsored & Hosted by Man AHL

[PyData] Oasis: an Open Source Framework for the Modelling of Natural Disaster Risk - Mark Pinkerton

PyData London Meetup #53 Tuesday, February 5, 2019 Open catastrophe modelling using Oasis LMF

- history and use of catastrophe models in (re)insurance, and the Oasis mission to provide choice and insight through open software and standards
- anatomy of a cat model
- simple example using Jupyter, Pandas, Bokeh + the oasislmf package
- the future - understanding model uncertainty, application of ML to model building, new data sources

Sponsored & Hosted by Man AHL

[PyData] The how and why of Elasticsearch - Honza Král - PyData Prague, January 2019

Elasticsearch is an open source distributed datastore. We will see what makes it tick and how to use it from Python, and try to draw some conclusions as to where it might fit in your organization. Honza Král is a principal consulting architect from Elastic and a long-term core committer to Django. https://www.meetup.com/PyData-Prague

[PyData] Optimizing numerical calculations in Python - Jakub Urban - PyData Prague, January 2019

Jakub Urban, a senior Pythonista from Quantlane with rich experience in scientific computing and modelling, will show various possibilities for making your (mostly) numerical calculations in Python fast. He will cover optimization and parallelization using Numpy, Numba, Cython or Dask. You will learn that Python can be as fast as Fortran with very little effort. In case it cannot, you will see how to seamlessly turn Fortran/C/C++ into a Python module. https://www.meetup.com/PyData-Prague

[PyData] Similarity learning using deep neural networks - Jacek Komorowski

PyData Warsaw 2018 Deep neural networks give very good results in visual object recognition tasks, but they require a large number of training examples from each category. I'll present a class of neural network architectures that can be used when only a few training examples from each class are available. They are based on the 'similarity learning' concept and can be used to solve various practical problems.

[PyData] Visualize, Explore and Explain Predictive ML Models - Przemyslaw Biecek

PyData Warsaw 2018 Why do you need tools for exploration, visualisation and explanation of predictive models? During the talk I will present use cases in which model interpretability is crucial and give an overview of tools that support model interpretability (Ceteris Paribus, Break Down, LIME, live, auditor). See more at DALEXverse: https://github.com/pbiecek/DALEX

[PyData] A deep revolution in speech processing and analysis - Pawel Cyrta

PyData Warsaw 2018 In the past two years, we’ve seen the industry discover speech as a critical interface protocol between humans and machines, especially for cloud-based information queries driven by speech recognition. These create significant new opportunities for every application that touches audio or video - opening up new potential for improved intelligibility, personalisation and customer “stickiness”.

[PyData] Step by step face swap - Sylwester Brzęczkowski

PyData Warsaw 2018 I will present how to implement a face swap. We will go step by step from a simple “copy & paste” of a face onto another image to a fully functioning and nice-looking face swap.

[PyData] Can you trust neural networks? - Mateusz Opala

PyData Warsaw 2018 Recently neural networks have become superior in many machine learning tasks. However, they are more difficult to interpret than simpler models such as decision trees. Such a condition is not acceptable in industries like healthcare or law. In this talk, I will talk about a unified approach to explaining the output of any machine learning model, especially neural networks.

[PyData] Recognizing products from raw text descriptions using “shallow” and “deep” machine learning

PyData Warsaw 2018 We will compare “shallow” and “deep” machine learning approaches to solving a natural language processing problem. Pros, cons and consequences of both choices will be discussed.

[PyData] So you want to be a Python expert? | PyData Seattle 2017

www.pydata.org PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

[PyData] AI meets Art - Agata Chęcińska

PyData Warsaw 2018 The intersection of artificial intelligence and the art world is an intriguing concept. I will show several examples of this intersection, with a focus on the AI-based "Museum Treasures" game, a winning solution from HackArt, a hackathon organised by the National Museum in Warsaw. I will also touch on the topic of analogies between creative processes and machine learning processes.

[PyData] Deep Learning Semantic Segmentation for Nucleus Detection - Dawid Rymarczyk

PyData Warsaw 2018 Semantic segmentation is the process which aims to classify individual pixels of an image. Recently, Kaggle hosted the 2018 Data Science Bowl competition dedicated to nucleus detection and segmentation based on microscopic images. In this talk, I will present two approaches to this problem, based on U-Net and Mask R-CNN.

[PyData] Spot the difference: train your image analytics model to recognize fine grained similarity

PyData Warsaw 2018 Imagine two images of the same car model, same color and with small scratches on bumpers. How would you make a machine look at the scratches and decide if they are the same? This is a story about the implementation of a new neural network structure, trained on “triplets” of images, which recognizes fine-grained image similarity based on the Deep Ranking algorithm.

[PyData] Spammers vs. Data: My everyday fight - Juan De Dios Santos

PyData Warsaw 2018 LOVOO, a dating app, is attractive to spammers. These spammers disguise themselves as real users, using believable images and accurate descriptions on their profiles to lure our community into chatting with them. This presentation is on everything Antispam. In it, I will talk about the architecture of the system, data, and the algorithms and methods we employ to fight the spammers.

[PyData] Can you enhance that? Single Image Super Resolution - Katarzyna Kańska

PyData Warsaw 2018 The talk will introduce the problem of upscaling a picture with as little loss of quality as possible using deep learning techniques. What metric to use when evaluating the solution? Upsample the image in the beginning or near the end of your neural network? Which upscaling layers to use? Answers to these and more questions on this topic will be discussed.

[PyData] Hitting the gym: controlling traffic with Reinforcement Learning - Steven Nooijen

PyData Warsaw 2018 Finally a good real-life use case for Reinforcement Learning (RL): traffic control! In this talk I will show you how we hooked up traffic simulation software to Python and how we built our own custom gym environment to run RL experiments with keras-rl for a simple 4-way intersection.

[PyData] Predicting preterm birth with convolutional neural nets - Tomasz Włodarczyk, Szymon Płotka

PyData Warsaw 2018 Ideally, a child is born after the full duration of pregnancy. According to WHO data, 15 million children are born prematurely every year, of whom 1.1 million die. In this talk, we will present how to improve the prediction rate of spontaneous preterm delivery using deep learning and computer vision methods.

[PyData] Distributed deep learning and why you may not need it - Jakub Sanojca, Mikuláš Zelinka

PyData Warsaw 2018 Deep learning thrives on ever bigger networks and ever growing datasets, but a single machine can only handle so much. When to scale to multiple machines and how to do it efficiently? What are the pros and cons of the available options, and what is the theory behind their approach to distributed training? In this talk we will answer those questions and show what problems we are trying to solve at Avast.

[PyData] Uncertainty estimation and Bayesian Neural Networks - Marcin Możejko

PyData Warsaw 2018 We will show how to assess the uncertainty of deep neural networks. We will cover Bayesian Deep Learning and other out-of-distribution detection methods. The talk will include examples that will show how to implement the methods in PyTorch.

[PyData] It is never too much: training deep learning models with more than one modality - Adam Słucki

PyData Warsaw 2018 Using visual features alone is not enough to fully exploit the content of social media videos. We propose a whole pipeline for extracting textual data embedded in videos and fusing them with the original visual data to predict drops in viewer retention.

[PyData] Computer vision challenges in drug discovery - Maciej Hermanowicz

PyData Warsaw 2018 I will present a high-level overview of how automated image analysis approaches can be incorporated into pharmaceutical discovery pipelines. By taking a look at two GSK case studies I will demonstrate how to apply computer vision techniques to featurize imaging data, enabling the use of standard machine learning algorithms. I will highlight how these techniques benefit the drug discovery process.

[PyData] Data Pipeline Hyperparameter Optimization - Alex Quemy

PyData Warsaw 2018 It is commonly accepted that about 80% of data scientists' time is spent on preparing data, including setting up the proper data pipeline or ETL. For a large part, the proper configuration of a given data pipeline is the result of the data scientist's experience and Subject Matter Expert knowledge, plus a dose of arbitrary decisions. What if most of this work could be automated? Better, is it possible to find some universal pipeline configurations that can work well on a wide range of domains and thus transfer what has been learned on one dataset to another? In this presentation, we show on a PoC that Sequential Model-Based Optimization techniques can be used to tune data pipeline hyperparameters in order to improve model accuracy. We discuss how to measure if optimal configurations are algorithm-specific or independent, and show that, in the specific case of NLP preprocessing operators, there might exist some kind of generally good configurations, independently of th

[PyData] PyData Ann Arbor: Madicken Munk | Widgyts: yt Jupyter Widgets for Volumetric Data Exploration

PyData Ann Arbor Meetup - March 13, 2019 Sponsored by NumFOCUS and TD Ameritrade https://www.meetup.com/PyData-Ann-Arbor/ yt is an open source data visualization package for mesh- and point-based data. It has been used to visualize data in a number of scientific domains, including astrophysics, nuclear engineering, seismology, and oceanography. Often this data is large, interesting, and complex. widgyts is yt's custom Jupyter widgets package using Rust compiled to WebAssembly to enable interactive exploration of data with yt. This talk will cover some of the technical aspects of widgyts that allow for performant visual data exploration, the path forward for widgyts with yt, and include some live demos of widgyts with real data. ----- Madicken Munk is a postdoc in the Data Exploration Lab at the National Center for Supercomputing Applications at Urbana-Champaign. Madicken received

[PyData] Deep Learning on Mobile Devices - William Grisaitis

PyData Miami Meetup - March 5, 2019 https://www.meetup.com/Miami-Machine-... While GPUs have been instrumental in the deep learning revolution since 2012, smartphones can also run deep neural networks on their own hardware and exceed state-of-the-art image classification performance from just a few years ago. This is mostly due to advances in neural network design and model structures. Innovations like the depth-wise separable convolution, for example, have enabled more efficient computation in neural nets. Hardware has also advanced in terms of compute and memory capacity. Put together, a smartphone today can quickly classify images with a lightweight neural network with a higher accuracy than AlexNet achieved in 2012. ---- William Grisaitis is a machine learning engineer and curious person based in Orlando, Florida. William has worked in deep learning since 2016 when he joined a deep learning research lab at the HHMI Janelia Research Campus and contributed to peer-reviewed
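To make the efficiency claim about depth-wise separable convolutions concrete, here is a minimal sketch that only counts weights. It is not taken from the talk: the function names and the layer sizes (3x3 kernel, 128 input channels, 256 output channels) are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise separable convolution (bias terms ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
kernel, c_in, c_out = 3, 128, 256
std = standard_conv_params(kernel, c_in, c_out)
sep = depthwise_separable_params(kernel, c_in, c_out)
print(f"standard: {std} weights, separable: {sep} weights, ratio: {std / sep:.1f}x")
```

The ratio works out to 1 / (1/c_out + 1/kernel^2), so for 3x3 kernels the saving approaches 9x as the number of output channels grows.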

[O'Reilly] Jupyter widgets - Maarten Breddels (Maarten Breddels), Sylvain Corlay (QuantStack)

Project Jupyter aims to provide a consistent set of tools for data science workflows, from the exploratory phase of the analysis to the sharing of the results. Maarten Breddels and Sylvain Corlay offer an overview of Jupyter’s interactive widgets framework, which enables rich user interaction, including 2D and 3D interactive plotting, geographic data visualization, and much more, as well as the main interactive visualization libraries that were built upon it. More than a set of controls, Jupyter widgets are a framework upon which one can build. Just as the kernel protocol is agnostic to the programming language, Jupyter widgets enable bidirectional communication between the web browser and the kernel in a language-agnostic fashion. Beyond the Python reference implementation, C++, R, and JVM languages (Java, Clojure, Scala, Kotlin, and Groovy) have joined this exciting development, allowing users to reuse all the frontend code. Join Maarten and Sylvain to explore the capabilitie

[O'Reilly] Visualizing machine learning models in the Jupyter Notebook - Chakri Cherukuri (Bloomberg LP)

Chakri Cherukuri offers an overview of the interactive widget ecosystem available in the Jupyter notebook, including ipywidgets and bqplot, and illustrates how Jupyter widgets can be used to build rich visualizations of machine learning models. Along the way, Chakri walks you through algorithms like regression, clustering, and optimization and shares a wizard for building and training deep learning models with diagnostic plots. This session is sponsored by Bloomberg LP. Subscribe to O'Reilly on YouTube: http://goo.gl/n3QSYi Follow O'Reilly on: Twitter: http://twitter.com/oreillymedia Facebook: http://facebook.com/OReilly Instagram: https://www.instagram.com/oreillymedia LinkedIn: https://www.linkedin.com/company-beta...

[O'Reilly] Making beautiful objects with Jupyter - M Pacer (Netflix)

Jupyter displays a rich array of media types out of the box. M Pacer explains how to use these capabilities to their full potential, covering how to add rich displays to existing and new Python classes and how to customize the way notebooks are converted to other formats. You’ll learn what MIME types are and how to use them, explore Jupyter’s display mechanisms and protocol, and dive into nbconvert. These skills will enable you to make beautiful objects with Jupyter.

Topics include:
- How to enhance the display objects and classes in the notebook
  - By adding metadata to any output via IPython.display
  - By adding custom repr methods
  - By using updatable displays
- Libraries that enable new kinds of displays, including:
  - vdom: A Python library for React-like declarative layouts
  - display_xml: A Python library for displaying highlighted, indented XML
- How to convert notebooks into custom objects using nbconvert, which allows:
  - Hiding prompts and code cells to show only the output figures
  - HTML wh
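As an illustration of the "custom repr methods" topic, the sketch below adds a `_repr_html_` method to a plain Python class. `_repr_html_` is the hook IPython and Jupyter look for when rendering an object as rich output; the `Swatch` class itself is a hypothetical example and is not taken from the talk.

```python
from html import escape


class Swatch:
    """Hypothetical example class: a named colour with a hex code."""

    def __init__(self, name, hex_code):
        self.name = name
        self.hex_code = hex_code

    def __repr__(self):
        # Plain-text fallback, used in terminals and logs.
        return f"Swatch(name={self.name!r}, hex_code={self.hex_code!r})"

    def _repr_html_(self):
        # Rich display hook: a notebook frontend renders this HTML
        # instead of the plain repr when the object ends a cell.
        return (
            f'<span style="background:{escape(self.hex_code)};padding:0 1em"></span> '
            f"{escape(self.name)}"
        )


swatch = Swatch("teal", "#008080")
html = swatch._repr_html_()  # the markup a notebook would render
print(repr(swatch))
print(html)
```

Outside a notebook the object behaves like any other Python object, which is why keeping a sensible `__repr__` alongside the rich hook is a reasonable default.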

[O'Reilly] Jupyter at Netflix - Kyle Kelley (Netflix)

So, Netflix's data scientists and engineers... do they know things? Join Kyle Kelley to find out. Kyle explores how Netflix uses Jupyter and explains how you can learn from Netflix's experience to enable analysts at your organization.

[O'Reilly] Building Interactive Applications and Dashboards in the Jupyter Notebook

Romain Menegaux (Bloomberg LP) and Chakri Cherukuri (Bloomberg LP) demonstrate how to develop advanced applications and dashboards using open source projects, illustrated with examples in machine learning, finance, and neuroscience.

[O'Reilly] Derivatives Analytics with Python - O'Reilly Webcast

Originally aired June 24, 2014. In this webcast you will learn how Python can be used for Derivatives Analytics and Financial Engineering. Dr. Yves J. Hilpisch will begin by covering the necessary background information, theoretical foundations and numerical tools to implement a market-based valuation of stock index options. The approach is a practical one in that all important aspects are illustrated by a set of self-contained Python scripts.

This webcast talk will cover:
- Financial Algorithm Implementation
  - Monte Carlo Valuation
  - Binomial Option Pricing
- Performance Libraries
  - Dynamic Compiling
  - Parallel Code Execution
- DX Analytics
  - Overview and Philosophy
  - Multi-Risk Derivatives Pricing
  - Global Valuation
- Web Technologies for Derivative Analytics

About Yves Hilpisch: Yves Hilpisch has 10 years of experience with Python, particularly in the finance space. He founded The Python Quants GmbH - an independent, privat
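For readers unfamiliar with the binomial option pricing topic listed for this webcast, here is a minimal sketch of a Cox-Ross-Rubinstein tree for a European call. It is not code from the webcast; the function name and all parameter values are illustrative assumptions.

```python
import math


def binomial_call(s0, strike, rate, sigma, maturity, steps):
    """European call value on a Cox-Ross-Rubinstein binomial tree (illustrative sketch)."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))         # up-move factor per step
    down = 1.0 / up                              # down-move factor per step
    q = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability
    discount = math.exp(-rate * dt)

    # Payoffs at maturity for j up-moves out of `steps`.
    values = [
        max(s0 * up**j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]
    # Roll back through the tree, discounting expected values one step at a time.
    for _ in range(steps):
        values = [
            discount * (q * values[j + 1] + (1.0 - q) * values[j])
            for j in range(len(values) - 1)
        ]
    return values[0]


# Hypothetical inputs: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
price = binomial_call(100.0, 100.0, 0.05, 0.2, 1.0, 500)
print(f"call value: {price:.2f}")
```

With more steps the tree value converges toward the Black-Scholes price for the same inputs, which makes the step count a simple accuracy versus runtime trade-off.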

[O'Reilly] O'Reilly Webcast: TDD Web Development from Scratch with Python

In this hands-on webcast presented by Harry Percival, author of Test-Driven Development with Python, you will learn:
- How to use TDD to build a web application from the ground up
- Full functional testing using the Selenium browser automation tool
- Unit tests for all aspects of Django: URLs, views, models, templates

Who should attend this event: This live webcast is suitable for relative beginners. You should know basic Python, but if you've never used TDD or Django you should be fine.

About Harry Percival: After an idyllic childhood spent playing with BASIC on French 8-bit computers like the Thomson T-07 whose keys go "boop" when you press them, Harry spent a few years being deeply unhappy with Economics and management consultancy. Soon he rediscovered his true geek nature, and was lucky enough to fall in with a bunch of XP fanatics, working on the pioneering but sadly defunct Resolver One spreadsheet. He now works at PythonAn

[O'Reilly] Managing Large Datasets with Python and HDF5 - O'Reilly Webcast

Are you using Python to process large numerical datasets? Over the past few years, the Hierarchical Data Format (HDF5) has emerged as the mechanism of choice for processing, archiving and sharing scientific datasets ranging from gigabytes to terabytes and beyond. With a diverse user base spanning the range from NASA to the financial industry, HDF5 lets you create high-performance, portable, self-describing containers for your data. HDF5's flexibility and speed make it particularly well-suited to analysis in Python. This webcast provides a practical, Python-based introduction to the world of HDF5. This webcast led by Andrew Collette will cover: - The basics of the format - Performance - Best practices for making sharable data files which can be read by colleagues on other platforms About Andrew Collette Andrew Collette holds a Ph.D. in physics from UCLA, and works as a laboratory research scientist at the University of Colorado. He has worked with the Python-NumPy-HDF5 st

[todaycode오늘코드] [15] Pandas Basics - Using groupby in Python pandas for a variety of data aggregations

Pandas Basics - Using groupby in Python pandas for a variety of data aggregations * https://pandas.pydata.org/Pandas_Chea... * 10 Minutes to pandas: https://dataitgirls2.github.io/10minu...

[todaycode오늘코드] [14] Pandas Basics - Combining DataFrames with merge using the left, right, inner, and outer options - Data analysis with Python pandas

Pandas Basics - Combining DataFrames with merge using the left, right, inner, and outer options * https://pandas.pydata.org/Pandas_Chea... * 10 Minutes to pandas: https://dataitgirls2.github.io/10minu...