Archive for the ‘machine learning’ Category

Microsoft open sources algorithm that gives Bing some of its smarts

May 15th, 2019
The Eiffel Tower. (credit: Pedro Szekely)

Search engines today are more than just the dumb keyword matchers they used to be. You can ask a question—say, "How tall is the tower in Paris?"—and they'll tell you that the Eiffel Tower is 324 meters (1,063 feet) tall, about the same as an 81-story building. They can do this even though the question never actually names the tower.

How do they do this? As with everything else these days, they use machine learning. Machine-learning algorithms are used to build vectors—essentially, long lists of numbers—that in some sense represent their input data, whether it be text on a webpage, images, sound, or videos. Bing captures billions of these vectors for all the different kinds of media that it indexes. To search the vectors, Microsoft uses an algorithm it calls SPTAG ("Space Partition Tree and Graph"). An input query is converted into a vector, and SPTAG is used to quickly find "approximate nearest neighbors" (ANN), which is to say, vectors that are similar to the input.

This (with some amount of hand-waving) is how the Eiffel Tower question can be answered: a search for "How tall is the tower in Paris?" will be "near" pages talking about towers, Paris, and how tall things are. Such pages are almost surely going to be about the Eiffel Tower.
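
With some hedging of our own, the mechanics look roughly like the sketch below: pages and queries are embedded as vectors, and the closest page vectors win. The embed() function here is a made-up stand-in for whatever model produces the vectors, and the search is exact brute force, whereas SPTAG's entire purpose is to find approximate nearest neighbors without comparing against every one of those billions of vectors:

import numpy as np

def embed(text):
    # Made-up stand-in: map text to a fixed-length vector.
    # (Random here; a real system uses a learned embedding model.)
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(128)

# Pretend these are pages Bing has already indexed.
pages = [
    "The Eiffel Tower is 324 metres (1,063 feet) tall.",
    "Paris is the capital of France.",
    "How to bake sourdough bread at home.",
]
index = np.stack([embed(p) for p in pages])
index /= np.linalg.norm(index, axis=1, keepdims=True)  # normalize each page vector

query = embed("How tall is the tower in Paris?")
query /= np.linalg.norm(query)

scores = index @ query         # cosine similarity against every page
best = int(np.argmax(scores))  # exact nearest neighbor, not approximate
# With a real text-embedding model, the Eiffel Tower page would score highest.
print(pages[best], round(float(scores[best]), 3))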

Posted in machine learning, microsoft, Open Source, Tech

Why Google believes machine learning is its future

May 10th, 2019
Google CEO Sundar Pichai speaks during the Google I/O Developers Conference on May 7, 2019. (credit: David Paul Morris/Bloomberg via Getty Images)

One of the most interesting demos at this week's Google I/O keynote featured a new version of Google's voice assistant that's due out later this year. A Google employee asked the Google Assistant to bring up her photos and then show her photos with animals. She tapped one and said, "Send it to Justin." The photo was dropped into the messaging app.

From there, things got more impressive.

"Hey Google, send an email to Jessica," she said. "Hi Jessica, I just got back from Yellowstone and completely fell in love with it." The phone transcribed her words, putting "Hi Jessica" on its own line.

Posted in AI, google, machine learning, pixel, Tech, TPU

Blockchain, zero-code machine learning coming to Azure

May 3rd, 2019
(credit: Caetano Candal Sato / Flickr)

Microsoft's annual developer conference kicks off on Monday, and the company will no doubt have all manner of things to announce for Azure and, if we're lucky, Windows. To whet our appetites, the company has unveiled a crop of new Azure and Internet-of-Things services with, as we should expect these days, a focus on machine learning and blockchain.

First up are some new capabilities under the cognitive-services banner. These are the services that most closely mimic human cognition: image recognition, speech-to-text, translation, and so on. Microsoft is adding a new category of service that it's calling "Decision." In this category are services that make recommendations to aid decision-making. Microsoft is putting some existing services into this category: Content Moderator (which tries to automatically detect offensive or undesirable text, images, and video) and Anomaly Detector (which examines time-series data to find outlier or anomalous events). To these, Microsoft is adding Personalizer, which learns about a user's preferences and makes recommendations accordingly.
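
To give a sense of what an anomaly-detection service answers, here's a toy sketch of time-series outlier detection. It is not Microsoft's Anomaly Detector API or algorithm, just a rolling z-score check to illustrate the idea:

import statistics

def find_anomalies(series, window=10, threshold=3.0):
    # Flag points that sit far outside what the recent history suggests.
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mean = statistics.fmean(history)
        stdev = statistics.pstdev(history) or 1e-9  # avoid divide-by-zero on flat data
        z = (series[i] - mean) / stdev
        if abs(z) > threshold:
            anomalies.append((i, series[i], round(z, 1)))
    return anomalies

# A flat signal with one obvious spike at index 25.
data = [100.0] * 50
data[25] = 500.0
print(find_anomalies(data))  # the spike at index 25 is reported as anomalous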

Microsoft is also offering previews of its Ink Recognizer (which turns handwriting into machine-readable text) and Form Recognizer, which can extract structured data from hand-filled forms. Cognitive Search, which uses machine learning to enable searching across disparate data types (such as OCR-scanned images, PDFs, and handwritten notes), is being promoted to general availability.

Posted in azure, cloud, edge computing, IoT, machine learning, microsoft, Tech

OpenAI bot crushes Dota 2 champions, and now anyone can play against it

April 15th, 2019
Shadow Fiend, looking shadowy and fiendish. (credit: Valve)

Over the past several years, OpenAI, a startup with the mission of ensuring that "artificial general intelligence benefits all of humanity," has been developing a machine-learning-driven bot to play Dota 2, the greatest game in the universe. Starting from a very cut-down version of the full game, the bot has been developed over the years through playing millions upon millions of matches against itself, learning not just how to play the five-on-five team game but how to win, consistently.

We've been able to watch the bot's development over a number of show matches, with each one using a more complete version of the game and more skilled human opponents. This culminated in what's expected to be the final show match over the weekend, when OpenAI Five was pitted in a best-of-three match against OG, the team that won the biggest competition in all of esports last year, The International.

OpenAI Five is subject to a few handicaps in the name of keeping things interesting. Each of its five AI players is running an identical version of the bot software, with no communication among them: they're five independent players who happen to think very alike but have no direct means of coordinating their actions. The bots' reaction times are artificially slowed down to ensure that the game isn't simply a showcase of superhuman reflexes. And the bot still isn't using the full version of the game: only a limited selection of heroes is available, and items that create controllable minions or illusions are banned because it's felt that the bot would be able to micromanage its minions more effectively than any human could.

Posted in Artificial intelligence, cloud, dota 2, Gaming & Culture, machine learning, OpenAI, Tech

Clippy briefly resurrected as Teams add-on, brutally taken down by brand police

March 22nd, 2019
(credit: theaelix)

On Microsoft's official Office GitHub repository (which contains, alas, not the source code to Office itself but lots of developer content for software that extends Office), the widely loved (?) Clippy made a brief appearance with the publication of a Clippy sticker pack for Microsoft Teams. Teams users could import the stickers and use them to add pictures of a talking paperclip to their conversations.

The synergy between the two seems obvious. With its various machine-learning-powered services and its bot development framework, Microsoft finally has the technology to make Clippy the assistant we always wanted him to be: a Clippy that can be asked natural-language questions, that we can actually speak to and that can talk back to us, that can recognize us by sight and greet us as we sit down to the working day. Teams, an interface that's conversational and text-heavy, is the perfect venue for a new Clippy compliant with all the buzzwords of the late twenty-teens. Twenteens? Whatever.

Clippy is, after all, far more expressive than Cortana. While Clippy and Cortana share a tendency to reshape their basic form to meet the needs of the task at hand—Clippy can distort itself into a question mark or an envelope or whatever, and Cortana can deviate from her usual circular form—Clippy has a killer advantage in that it has eyes, and more particularly, eyebrows, enabling a range of emotions such as incredulity and contemptuous pity that Cortana can only dream of.

Posted in clippy, cloud, Fun, machine learning, microsoft, paperclip, Teams, Tech

Microsoft’s latest security service uses human intelligence, not artificial

February 28th, 2019
Microsoft security experts monitoring the world, looking for hackers. (credit: Microsoft)

Microsoft has announced two new cloud services to help administrators detect and manage threats to their systems. The first, Azure Sentinel, is very much in line with other cloud services: it's dependent on machine learning to sift through vast amounts of data to find a signal among all the noise. The second, Microsoft Threat Experts, is a little different: it's powered by humans, not machines.

Azure Sentinel is a machine-learning-based Security Information and Event Management (SIEM) service that takes the (often overwhelming) stream of security events—a bad password, a failed attempt to elevate privileges, an unusual executable that's blocked by anti-malware, and so on—and distinguishes between important events that actually deserve investigation and mundane events that can likely be ignored.

Sentinel can use a range of data sources. There are the obvious Microsoft sources—Azure Active Directory, Windows Event Logs, and so on—as well as integrations with third-party firewalls, intrusion-detection systems, endpoint anti-malware software, and more. Sentinel can also ingest any data source that uses ArcSight's Common Event Format, which has been adopted by a wide range of security tools.
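
For a sense of what ingesting a CEF data source involves, here's a rough sketch of parsing a single Common Event Format record. The vendor, product, and field values below are made up, and real parsers also handle escaped pipe characters and values containing spaces, which this toy version ignores:

def parse_cef(line):
    # CEF header: CEF:Version|Device Vendor|Device Product|Device Version|
    #             Signature ID|Name|Severity|Extension
    parts = line.split("|", 7)
    event = {
        "cef_version": parts[0].split(":", 1)[1],
        "vendor": parts[1],
        "product": parts[2],
        "device_version": parts[3],
        "signature_id": parts[4],
        "name": parts[5],
        "severity": parts[6],
    }
    # Extension: space-separated key=value pairs (simplified handling).
    for pair in parts[7].split():
        if "=" in pair:
            key, value = pair.split("=", 1)
            event[key] = value
    return event

sample = ("CEF:0|ExampleVendor|ExampleFirewall|1.0|100|Blocked connection|5|"
          "src=10.0.0.5 dst=203.0.113.7 dpt=443")
print(parse_cef(sample))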

Posted in azure, cloud, enterprise, machine learning, microsoft, security, Tech, windows defender advanced threat protection

Twenty minutes into the future with OpenAI’s Deep Fake Text AI

February 27th, 2019
(credit: Max Headroom / Aurich)

In 1985, the TV film Max Headroom: 20 Minutes into the Future presented a science fictional cyberpunk world where an evil media company tried to create an artificial intelligence based on a reporter's brain to generate content to fill airtime. There were somewhat unintended results. Replace "reporter" with "redditors," "evil media company" with "well meaning artificial intelligence researchers," and "airtime" with "a very concerned blog post," and you've got what Ars reported about last week: Generative Pre-trained Transformer-2 (GPT-2), a Franken-creation from researchers at the non-profit research organization OpenAI.

Unlike some earlier text-generation systems based on a statistical analysis of text (like those using Markov chains), GPT-2 is a text-generating bot based on a model with 1.5 billion parameters. (Editor's note: We recognize the headline here, but please don't call it an "AI"—it's a machine-learning algorithm, not an android). With or without guidance, GPT-2 can create blocks of text that look like they were written by humans. With written prompts for guidance and some fine-tuning, the tool could theoretically be used to post fake reviews on Amazon, fake news articles on social media, fake outrage to generate real outrage, or even fake fiction, forever ruining online content for everyone. All of this comes from a model created by sucking in 40 gigabytes of text retrieved from sources linked by high-ranking Reddit posts. You can only imagine how bad it would have been if the researchers had used 40 gigabytes of text from 4chan posts.
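
For contrast, here's the kind of Markov-chain generator that those older statistical systems were built on. This is a toy sketch: it only learns which word tends to follow which in its training text, with no longer-range context, which is why its output rambles where GPT-2's stays on topic:

import random
from collections import defaultdict

def train(text):
    # Record, for each word, the words that follow it in the training text.
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=20):
    # Walk the chain, picking a random recorded follower at each step.
    word, output = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

corpus = ("the tower in paris is tall and the tower is made of iron "
          "and the tower attracts visitors from around the world")
print(generate(train(corpus), "the"))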

After a little reflection, the research team developed concerns about the policy implications of its creation. Ultimately, OpenAI's researchers kept the full thing to themselves, only releasing a pared-down, 117-million-parameter version of the model (which we have dubbed "GPT-2 Junior") as a safer demonstration of what the full GPT-2 model could do.

Posted in Biz & IT, Features, machine learning, text generation

Researchers, scared by their own work, hold back “deepfakes for text” AI

February 15th, 2019
This is fine.

OpenAI, a non-profit research company investigating "the path to safe artificial intelligence," has developed a machine learning system called Generative Pre-trained Transformer-2 (GPT-2), capable of generating text based on brief writing prompts. The result comes so close to mimicking human writing that it could potentially be used for "deepfake" content. Trained on 40 gigabytes of text retrieved from sources on the Internet (including "all outbound links from Reddit, a social media platform, which received at least 3 karma"), GPT-2 generates plausible "news" stories and other text that match the style and content of a brief text prompt.
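
The small model that OpenAI did release can be prompted in exactly this way. As a minimal sketch, assuming the later Hugging Face transformers packaging of those released weights (not OpenAI's own release code):

# Requires: pip install transformers torch
from transformers import pipeline

# "gpt2" is the small, publicly released GPT-2 model.
generator = pipeline("text-generation", model="gpt2")
prompt = "Scientists announced today that the Eiffel Tower will be"
for sample in generator(prompt, max_length=60, num_return_sequences=2):
    print(sample["generated_text"])
    print("---")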

The performance of the system was so disconcerting that the researchers are releasing only a much smaller version of GPT-2. In a blog post on the project and this decision, researchers Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever wrote:

Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code. We are not releasing the dataset, training code, or GPT-2 model weights. Nearly a year ago we wrote in the OpenAI Charter: “we expect that safety and security concerns will reduce our traditional publishing in the future, while increasing the importance of sharing safety, policy, and standards research,” and we see this current work as potentially representing the early beginnings of such concerns, which we expect may grow over time. This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas.

OpenAI is funded by contributions from a group of technology executives and investors connected to what some have referred to as the PayPal "mafia"—Elon Musk, Peter Thiel, Jessica Livingston, and Sam Altman of Y Combinator, former PayPal COO and LinkedIn co-founder Reid Hoffman, and former Stripe Chief Technology Officer Greg Brockman. Brockman now serves as OpenAI's CTO. Musk has repeatedly warned of the potential existential dangers posed by AI, and OpenAI is focused on trying to shape the future of artificial intelligence technology—ideally moving it away from potentially harmful applications.

Posted in AI, artificial intelligence, Biz & IT, computer-generated text, deep fake, deepfake, fake news, machine learning, Markov chain

Mozilla to use machine learning to find code bugs before they ship

February 12th, 2019

Ubisoft's Commit-Assistant

In a bid to cut the number of coding errors made in its Firefox browser, Mozilla is deploying Clever-Commit, a machine-learning-driven coding assistant developed in conjunction with game developer Ubisoft.

Clever-Commit analyzes code changes as developers commit them to the Firefox codebase. It compares them to all the code it has seen before to see if they look similar to code that the system knows to be buggy. If the assistant thinks that a commit looks suspicious, it warns the developer. Presuming its analysis is correct, it means that the bug can be fixed before it gets committed into the source repository. Clever-Commit can even suggest fixes for the bugs that it finds. Initially, Mozilla plans to use Clever-Commit during code reviews, and in time this will expand to other phases of development, too. It works with all three of the languages that Mozilla uses for Firefox: C++, JavaScript, and Rust.
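
To make the similarity idea concrete, here's a deliberately tiny sketch of flagging a change that resembles previously buggy code. It illustrates the general approach only and is not Clever-Commit's actual model, which is trained on Firefox's real commit and bug history:

from difflib import SequenceMatcher

# Hypothetical snippets known (from bug-fix history) to have been buggy.
known_buggy_changes = [
    "if (ptr = nullptr) { return; }",    # assignment instead of comparison
    "for (int i = 0; i <= len; i++) {",  # off-by-one loop bound
]

def riskiness(new_change):
    # Highest textual similarity between the new change and any past buggy change.
    return max(
        SequenceMatcher(None, new_change, buggy).ratio()
        for buggy in known_buggy_changes
    )

commit_hunk = "for (int j = 0; j <= count; j++) {"
score = riskiness(commit_hunk)
if score > 0.6:
    print(f"Warning: this change resembles past buggy code (score {score:.2f})")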

The tool builds on work by Ubisoft La Forge, Ubisoft's research lab. Last year, Ubisoft presented the Commit-Assistant, based on research called CLEVER, a system for finding bugs and suggesting fixes. That system found some 60-70 percent of buggy commits, though it also had a false positive rate of 30 percent. Even though this false positive rate is quite high, users of this system nonetheless felt that it was worthwhile, thanks to the time saved when it did correctly identify a bug.

Posted in bugs, C#, development, machine learning, Mozilla, Programming, Tech, Ubisoft

Yes, “algorithms” can be biased. Here’s why

January 24th, 2019
Seriously, it's enough to make researchers cry. (credit: Getty | Peter M Fisher)

Dr. Steve Bellovin is professor of computer science at Columbia University, where he researches "networks, security, and why the two don't get along." He is the author of Thinking Security and the co-author of Firewalls and Internet Security: Repelling the Wily Hacker. The opinions expressed in this piece do not necessarily represent those of Ars Technica.

Newly elected Rep. Alexandria Ocasio-Cortez (D-NY) recently stated that facial recognition "algorithms" (and by extension all "algorithms") "always have these racial inequities that get translated" and that "those algorithms are still pegged to basic human assumptions. They're just automated assumptions. And if you don't fix the bias, then you are just automating the bias."

She was mocked for this claim on the grounds that "algorithms" are "driven by math" and thus can't be biased—but she's basically right. Let's take a look at why.
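
The mechanism is easy to demonstrate. In the toy sketch below (entirely made-up data), the "model" is trained on historical decisions that were skewed against one group; because the learner only reproduces the patterns in its training data, the skew comes out the other side as an automated decision rule:

from collections import Counter

# (group, qualified, historical_decision): past human decisions favored group A
# even when qualifications were the same. All data here is made up.
history = [
    ("A", True, "hire"), ("A", True, "hire"), ("A", False, "hire"),
    ("B", True, "reject"), ("B", True, "reject"), ("B", False, "reject"),
]

def train(data):
    # A crude "model" that memorizes the majority decision per group,
    # roughly what a classifier converges to when group is a dominant feature.
    per_group = {}
    for group, _qualified, decision in data:
        per_group.setdefault(group, Counter())[decision] += 1
    return {group: counts.most_common(1)[0][0] for group, counts in per_group.items()}

model = train(history)
print(model)  # {'A': 'hire', 'B': 'reject'}: the historical bias, now automated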

Posted in AI, algorithms, machine learning, ML, Policy