SpinVox Shares Details of its World-Beating Speech Technology Breakthroughs
Written by SpinVox
Monday, 27 July 2009
"We're in the last mile of solving the problem of reliable automatic speech conversion," says Daniel Doulton.
SpinVox's Voice Message Conversion System (VMCS) has become so advanced and efficient that it has reduced its use of QC agents by 98 per cent in just two years.
SpinVox's VMCS contains two billion words and phrases and knows over 99 per cent of the words you are likely to say.
Quality Control agents train VMCS to recognise new or difficult words and phrases.
London, UK - 26 July 2009.
SpinVox, the global leader in voice-to-content messaging, today revealed some of the details of the technological breakthroughs it has achieved in developing its Voice Message Conversion System (VMCS).
VMCS system
The core technology around which the VMCS is built is unique to the speech market, so there is little to which it can be compared. It is based on world-leading breakthroughs in automatic speech recognition (ASR) combined with artificial intelligence, semantics and natural linguistics, which have been developed by SpinVox's Cambridge-based Advanced Speech Group (ASG).
VMCS already knows more than 99 per cent of anything a user is likely to say. It contains over two billion words and phrases derived from the equivalent of 72 years of audio training - making it the world's largest corpus of spoken language. This knowledge system is growing at an ever-increasing rate and is further accelerating SpinVox's ability to automate.
Dr Tony Robinson, a peer of Prof. Philip Woodland of the Cambridge University Machine Intelligence Laboratory, leads the SpinVox ASG of more than 20 speech specialists and PhDs who are working continually to develop and refine the system.
Professor Woodland explains: "Much of this technology derives from the world-leading research undertaken by my group at Cambridge University Machine Intelligence Laboratory. I am a consultant to SpinVox and its Automatic Speech Group and their unique approach allows the automatic system to be exposed to huge amounts of spoken data, from which highly accurate acoustic and language models can be built."
This combined group of technologies and processes is the main engine for converting voice messages to text, rather than human intervention.
Having experimented with purely automatic speech conversion, SpinVox decided early in its development that the system had to evolve as fast as the language it converts. Its voice-to-text service handles the real-life, dynamic and fast-evolving language of the messages we exchange every day (known in the industry as 'free form speech'), so the system must keep up with the latest words, phrases, brand names and colloquialisms to ensure a high level of accuracy. This is why SpinVox describes the system as 'live-learning'.
SpinVox realised that only by combining its rapidly evolving state-of-the-art technology with human quality control and training could it create a system able to complete the elements of messages that could not be converted automatically. As a result, the system constantly learns new words and phrases, making it increasingly efficient and reliable.
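The 'live-learning' idea described above can be pictured with a small Python sketch. It is only an illustration under simplifying assumptions, not SpinVox's implementation: the names `convert_message`, `vocabulary` and `ask_agent` are hypothetical, and a real system would work on audio and statistical models rather than a set of known words.

```python
from typing import Callable, List, Set


def convert_message(
    heard: List[str],
    vocabulary: Set[str],
    ask_agent: Callable[[str], str],
) -> List[str]:
    """Convert a message, asking a human only about unknown tokens."""
    output = []
    for token in heard:
        if token in vocabulary:
            output.append(token)          # converted automatically
        else:
            corrected = ask_agent(token)  # a human fills the gap
            vocabulary.add(corrected)     # the system learns the new word
            output.append(corrected)
    return output


# Hypothetical usage: the agent is consulted once for a new word,
# and never again once the word has been learned.
vocabulary = {"see", "you", "at", "the"}
agent_calls: List[str] = []


def agent(token: str) -> str:
    agent_calls.append(token)
    return token


first = convert_message(["see", "you", "at", "the", "tweetup"], vocabulary, agent)
second = convert_message(["the", "tweetup"], vocabulary, agent)
print(agent_calls)  # ['tweetup']
```

The second message needs no human input because the word learned from the first message is already in the vocabulary, which is the sense in which human effort falls as the system is exposed to more input.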
Explains SpinVox CIO Rob Wheatley: "Quality Control agents are an important part of the SpinVox service because their constant minute-by-minute input actually improves the quality of text conversions in a process we call 'live learning'. The technology is a bit like a human brain in that the more it is exposed to input, the more it learns.
"This process has helped us improve our accuracy massively. Since its inception in 2007, the technology has improved to the extent that the system requires only two per cent of the input it required just two years ago and can even now predict more than 99 per cent of what most people speaking in English or Spanish will say next. Or to put it another way, in just two years, we have reduced the requirement for human intervention to just a few hundred agents per market compared to the thousands per market when we started. Our world-class speech scientists in the Advanced Speech Group have helped make this system unchallenged in terms of accuracy, speed and reliability."
Privacy
As discussed above, SpinVox's VMCS learns from human intervention. This is the reason SpinVox works with five world-class call centres, chosen after SpinVox put around 50 call centres through its stringent quality control and security procedures.
Every message is dealt with initially by the automated system. Only in cases where speech is too indistinct to be dealt with by the system, or contains unfamiliar or new words or phrases, is the completely anonymised and encrypted message sent to a QC agent for help. The agents will only ever see the messages that need input and do not know how many other messages have been converted, processed and sent automatically by VMCS.
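The routing rule in the paragraph above can be sketched as follows. This is a minimal illustration, not SpinVox's actual system: the `Segment` fields, the `confidence` score, the 0.9 threshold and the function names are all assumptions made for the example, and encryption is left out entirely. The point it shows is that an agent task carries only a random identifier and the part that needs help, with no caller or recipient attached.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Segment:
    caller: str        # known to the automated system only
    recipient: str     # known to the automated system only
    text: str          # best automatic guess at the words spoken
    confidence: float  # hypothetical recogniser score between 0 and 1


def needs_agent(segment: Segment, vocabulary: Set[str], threshold: float) -> bool:
    """True if the speech is too indistinct or contains unfamiliar words."""
    unfamiliar = any(w not in vocabulary for w in segment.text.lower().split())
    return segment.confidence < threshold or unfamiliar


def route(
    segments: List[Segment],
    vocabulary: Set[str],
    threshold: float = 0.9,
) -> Tuple[List[Segment], List[Dict[str, str]]]:
    """Split segments into automatically handled ones and anonymised agent tasks."""
    automatic: List[Segment] = []
    agent_tasks: List[Dict[str, str]] = []
    for segment in segments:
        if needs_agent(segment, vocabulary, threshold):
            # The task holds a random ID and the text only: no caller, no recipient.
            agent_tasks.append({"task_id": uuid.uuid4().hex, "text": segment.text})
        else:
            automatic.append(segment)
    return automatic, agent_tasks


# Hypothetical usage
segments = [
    Segment("alice", "bob", "see you at eight", 0.97),
    Segment("carol", "dave", "meet me at the tweetup", 0.95),  # unfamiliar word
    Segment("erin", "frank", "call me back", 0.40),            # indistinct speech
]
vocabulary = set("see you at eight meet me the call back".split())
automatic, agent_tasks = route(segments, vocabulary)
```

In this sketch only the second and third segments reach an agent, and nothing in an agent task reveals how many other messages were handled automatically.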
SpinVox is fully compliant with industry standards relating to the processing of information, including the Data Protection Act 1998. To this end, any part of a message seen by the Quality Control centres is anonymous, encrypted and randomised, meaning that it is impossible to determine where the messages are from or where they are going to. SpinVox has achieved two prestigious ISO qualifications: ISO 27001 (the international Information Security Standard) and ISO 9001:2008 quality certification.
Jaime Tronqued is President of ScopeWorks Asia, Inc., a Quality Control house which handles private and secure customer services for companies in telecommunications, travel, banking, healthcare and insurance.
ScopeWorks has been working with SpinVox for the past two years and Tronqued says: "I can categorically assure people that SpinVox messages are both private and secure. There are many layers of security and privacy that are used to ensure this and SpinVox was extremely thorough in its audit of our operations, our security and our privacy procedures as they ran a training pilot with our Quality Control agents on test conversion messages, prior to contracting with us to deliver a live customer service using encrypted and anonymised messages."
Adds Ragindra Persaud, CEO of NPIC, a South American-based Quality Control house which has also been working with SpinVox for the past two years: "SpinVox went through a thorough review of our processes and procedures, and it was not until they were fully satisfied that we comply with their stringent requirements that they contracted with us officially. In the live customer system there is no way for any Quality Control agent to know where a message has come from or to whom it is being sent, nor copy or abuse this."
SpinVox - now and in the future
SpinVox has achieved enormous success in the past and intends to achieve even greater success in the future. It currently has 30 million users worldwide and will be providing voice-to-text conversion for over 100 million users by the end of 2009. It is a British success story based on breakthroughs in technology that have established a new class of automated speech conversion.
Says co-founder Christina Domecq: "We have spent five years working very hard, building up a solid foundation for this business. It's something of which we are very proud, whether it's our technology, our customer group, subscriber base or investors. Like every business, we have to deal with challenges, particularly in the middle of a credit crunch, but we are in a strong position, are growing fast and are looking to the future with confidence. We are a business that is founded on a long-term vision and for that reason we do everything by the book, which includes a very thorough approach to security and due diligence."
Adds co-founder Daniel Doulton: "We knew when we started SpinVox that the time was right for this kind of breakthrough. We had to take this innovative approach, because leading commercial speech technology at the time was unable to deliver a reliable experience. We've developed our own, data-driven system which works on a 'meaning' (i.e. semantic) basis to solve the automation problem. To help achieve this, we have recruited the best speech scientists from Cambridge and abroad to build a team of more than 20 speech specialists and PhDs. Now, SpinVox is leading the market, having created a whole new category of 'speech as a service'. We've already signed 28 operators in the two short years we've been selling to carriers, and there are more deals to come."
Explains Julie Meyer, CEO, Ariadne Capital, an early investor in SpinVox: "The company's genius is how it integrates its technology with human intelligence, which is still the only thing that machines can learn from, to do virtually real-time voice-to-text conversion.
"SpinVox has built up huge value in terms of its intellectual property, customer relationships, management and revenue growth. Because they don't sell through UK mobile operators, people tend to forget just how successful they are continuing to be in the rest of the world. To close a deal in the past few months with Telefonica, for instance, and subsequently roll out the service to 13 countries across Latin America is an amazing feat that we in the UK should feel proud of."