US8490131B2 - Automatic capture of data for acquisition of metadata - Google Patents

Automatic capture of data for acquisition of metadata

Info

Publication number
US8490131B2
Authority
US
United States
Prior art keywords
television receiver
metadata
receiver device
audio data
display
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US12/590,259
Other versions
US20110102684A1 (en)
Inventor
Nobukazu Sugiyama
Jaime Chee
Ted Dunn
Utkarsh Pandya
Ling Jun Wong
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Sony Electronics Inc
Original Assignee
Sony Corp
Sony Electronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp, Sony Electronics Inc
Priority to US12/590,259
Assigned to SONY ELECTRONICS INC., SONY CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DUNN, TED; PANDYA, UTKARSH; SUGIYAMA, NOBUKAZU; CHEE, JAIME; WONG, LING JUN
Publication of US20110102684A1
Application granted
Publication of US8490131B2
Expired - Fee Related
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4392 Processing of audio elementary streams involving audio buffer management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4341 Demultiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8106 Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • H04N21/8113 Monomedia components thereof involving special audio data, e.g. different tracks for different languages comprising music, e.g. song in MP3 format
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/44 Receiver circuitry for the reception of television signals according to analogue transmission standards
    • H04N5/60 Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals
    • H04N5/602 Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals for digital sound signals

Definitions

  • CD compact disc
  • EPG electronic program guide
  • while the TV plays audio or video, it also continuously captures the audio or video data and sends a sample of the data to a content identification server in order to acquire the metadata of the content being played. Since the metadata is acquired from the server continuously (or on user demand), the device can show the metadata on the TV display.
  • the television device can hold the data using a circular buffer so that when the user initiates acquisition of the metadata at any time, the most recent six seconds of valid audio will be sent.
  • the TV device already has the music sample data and can send the sample data to the server immediately rather than having to first capture the data and then send the data to the content identification server and encounter undesirable delays. This speeds up the process by at least about six seconds according to present sample requirements.
  • 1. The TV decodes and displays TV shows that are received over a digital channel.
    2. While the TV decodes the digital video and audio, it captures decoded audio data, which may be pulse code modulated (PCM) data.
    3. Once the TV captures six seconds of audio, a check can be performed to see if the six second sample is valid audio, and if so, the data is sent to a content identification server (in this case, a music identification server).
    4. If the metadata received from the music identification server is valid, then the TV can show the metadata on the screen so that the music is identified to the user.
    5. The song metadata is added to a database with the current TV program data.
    6. The user launches the song history and is able to see all song history with related TV program data.
  • PCM pulse code modulated
  • the system includes a TV monitor 10 that displays TV programs in a more or less conventional manner under control by the user generally using a remote controller 14.
  • the TV monitor (or other connected receiver such as a TV set top box or other Internet enabled appliance) is connected via the Internet 18 to a Content Identification Server 22 such as those available commercially from content identification services such as Gracenote as previously discussed.
  • the Internet connection can also be used by the TV receiver system to obtain electronic program guide (EPG) data from an online EPG server such as 26.
  • when the metadata for a particular song that is playing on the TV is identified, it can be displayed on the TV monitor 10 as depicted in FIG. 2. As shown, the TV program plays normally, and the metadata in the form of song title and artist is displayed as overlay graphics 30. Any other form of display of the metadata can be used including transparent windows and the like, and other metadata including album and further artist information or album graphics can be utilized. It will be understood that the illustration of FIG. 2 is merely intended as one illustration of the possibilities and not intended to be limiting.
  • FIG. 3 is an example partial block diagram of a TV receiver device 50 consistent with certain embodiments of the present invention, wherein only the functional blocks used to implement the present functionality are depicted.
  • a receiver 54 or other data source or data interface receives digital A/V content from a source such as a cable network, satellite network, Telco, broadcaster or via the Internet and produces an output data transport stream carrying packetized audio, video and support data (e.g., PSIP data and the like).
  • packets are filtered (e.g., by packet identifier) and stripped of all packet overhead using a packet filter 58 in order to produce a qualified stream of compressed PCM audio data (e.g., using SDK compression or other standard specified by the content identification service) suitable for use by the content identification server.
  • Such format is specified by the content identification service so as to standardize the data received for identification.
  • this audio data can be continuously stored in a six second (or other suitable time) circular buffer 62 that permits the data to be immediately accessed for transmission to the content identification server 22 via modem 66 that is provided with a URI suitable for access to the server 22 via the Internet 18.
  • the content identification server 22 returns metadata if it can identify the song, and this metadata is parsed in a metadata parser 70, which may operate as a process running on the system control processor 74.
  • the parsed metadata can then be passed to a graphics overlay processor 78 for rendering to display driver 82 and TV monitor display 86.
  • control processor 74 which has associated storage (RAM, ROM, etc.) 90, with control processor 74 interfacing with each functional block as required and receiving commands from remote controller 14 via remote control command receiver 94 in any suitable manner.
  • FIG. 4 is a flow chart of a first example process 100 consistent with certain embodiments of the present invention, starting at 104, after which the system determines if the audio capture feature described is turned on at 108. If not, the remainder of the process can be defeated until the feature is turned on. If the capture feature is turned on at 108, the circular buffer begins capturing audio at 112 and continuously updates the audio so that at least six seconds (for the Gracenote implementation) of audio is always available. If the user determines that there is a song of interest playing as the audio at 116, then he or she may wish to identify the song.
  • a request can be made to the server at 124 to identify the song. This can occur in one embodiment as a result of a user request at 120 (shown in dashed lines to indicate that it is optional), or as a matter of course wherein the server query can be carried out using any other mechanism (for example, a time criterion of every 90 seconds or so).
  • the request at 124 can include conversion of the six seconds of audio into a smaller sized file representing a signature of the music sample that is used to identify the music.
  • a failure message may be briefly presented as a screen overlay at 132.
  • the song metadata may be stored to memory at 136.
  • the song metadata can be added to a database with or without current TV program data.
  • the song metadata can also be added to a song history list so that the user can launch a song history list command, and thereby retrieve from memory a list of songs identified with or without related TV program data.
  • the storage as a database or list is depicted as 140 and this data are then accessible by user command either directly from a remote control or from a TV menu system such as a cross-media bar menu system.
  • the parsed metadata can be displayed to an overlay window or otherwise on the TV screen at 148 so that the user can identify relevant information about the song such as title, artist, author, album, etc. or even display an icon obtained from the content identification database representing album graphics.
  • FIG. 5 shows one such variation with all operations being essentially the same as described in connection with FIG. 4 .
  • the system requests the metadata without user prompting. Again, this could be carried out on a periodic basis or whenever a song is determined to be playing because the sample has song attributes.
  • the metadata are already stored in memory and are not displayed until the user requests song info at 120. The information, in this case, can often be displayed immediately since it will have been previously requested and received.
  • 120 can be omitted as indicated by the broken lines and any time a song is identified its metadata can be displayed.
  • the analysis of the song to determine if it has song attributes is preferably done at the audio decoder level and possibly using hardware acceleration. This can be accomplished in a number of ways.
  • One example is depicted in FIG. 6 starting at 200, where first the audio stream is tested to see if it is a flat frequency signal at 204, which may be a monotone or a nearly silent stream. If so, the audio is unlikely to be a song and/or is unsuitable to produce a signature, and control returns to 204 for the next analysis. If the audio stream passes the test, the audio stream is passed through a set of filters at 208.
  • One such filter detects similar and repetitive frequencies, since such frequencies often resemble noise in the TV program or commercial.
  • Another such filter is a set of frequencies corresponding to normal conversational human speech, which represents a typical news reporting situation. If after such filtering at 208 the resultant audio stream is still not essentially flat in frequency response (i.e., it has distinct high and low frequencies), it is relatively certain that there is music in the audio stream at 212.
  • the original audio stream (or its compressed signature) can then be sent to the content identification server at 216. Also, it is noted that since content identification is user-initiated, one can expect that the past six seconds of the audio stream has something worth identifying, and hence send it up to the content identification server without a music check. The server will merely return a negative result if the audio stream is not recognized as a piece of music.
  • the audio streams can be compressed into signature files which the content identification server recognizes and expects.
  • the corresponding audio information (whether it is a song, and the time stamp) can be tagged to the signature files.
  • a television receiver device consistent with certain embodiments has a display associated with the television receiver device.
  • a filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of PCM digital audio data, wherein the filter separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets.
  • a circular buffer stores a sample of approximately six seconds in duration of the digital audio data.
  • a modem transmits the sample of audio data from the buffer to a content identification server and receives metadata identifying the audio data from the content identification server.
  • a storage device stores the metadata to a database or history list.
  • a metadata parser selects predefined elements of metadata from the metadata for display on the display.
  • a graphics overlay processor generates a graphics overlay containing the metadata. Either the sample of audio data is sent to the content identification server upon receipt of a user command from a remote control or the metadata is rendered to the display upon receipt of a user command from a remote control.
  • Another television receiver device has a display associated with the television receiver device.
  • a filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data.
  • a buffer stores a sample of the digital audio data.
  • a modem transmits the sample of audio data from the buffer to a content identification server and receives metadata identifying the audio data from the content identification server.
  • a display processor renders at least a portion of the metadata to the display.
  • the digital audio data comprises Pulse Code Modulated (PCM) audio data.
  • the buffer comprises a circular buffer that continuously stores a defined quantity of audio data.
  • the defined quantity of audio data comprises approximately six seconds of audio.
  • the filter separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets.
  • a metadata parser selects predefined elements of metadata from the metadata for display on the display.
  • the display processor comprises a graphics overlay processor that generates a graphics overlay containing the metadata.
  • the sample of audio data is sent to the content identification server upon receipt of a user command from a remote control.
  • the metadata is rendered to the display upon receipt of a user command from a remote control.
  • the metadata are stored to a database or history list on a storage device.
  • a television receiver device consistent with certain implementations has a display associated with the television receiver device.
  • a filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital content of a specified category.
  • a buffer stores a sample of the digital content.
  • a modem transmits the sample of digital content from the buffer to a content identification server and receives metadata identifying the digital content from the content identification server.
  • a display processor renders at least a portion of the metadata to the display.
  • the content comprises video content data.
  • the buffer comprises a circular buffer that continuously stores a defined quantity of the content.
  • the filter separates the specified category of content by use of packet identifiers to separate the specified category of content from other packets and produces the stream of digital content by removing packet overhead from packets of the specified category.
  • a metadata parser selects predefined elements of metadata from the metadata for display on the display.
  • the display processor comprises a graphics overlay processor that generates a graphics overlay containing the metadata.
  • the sample of content is sent to the content identification server upon receipt of a user command from a remote control.
  • the metadata is rendered to the display upon receipt of a user command from a remote control.
  • the metadata are stored to a database or history list on a storage device.
  • a method of rendering content metadata to a display associated with a television receiver device in a manner consistent with certain embodiments involves providing a display associated with the television receiver device; converting a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data; buffering a sample of the digital audio data in a buffer storage device; transmitting the sample of audio data from the buffer to a content identification server; receiving metadata identifying the audio data from the content identification server; and rendering at least a portion of the metadata to the display.
  • the digital audio data comprises Pulse Code Modulated (PCM) audio data.
  • the buffering is carried out in a circular buffer that continuously stores a defined quantity of audio data.
  • the defined quantity of audio data comprises approximately six seconds of audio.
  • the filtering separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets.
  • the method further involves parsing the metadata to select predefined elements of metadata from the metadata for display on the display.
  • the predefined elements of metadata are rendered to the display using a graphics overlay processor that generates a graphics overlay containing the metadata.
  • the sample of audio data is sent to the content identification server upon receipt of a user command from a remote control.
  • the metadata is rendered to the display upon receipt of a user command from a remote control.
  • the method further involves storing the metadata to a database or history list on a storage device. In certain implementations, prior to transmitting, a determination can be made as to whether the audio sample has song attributes.
  • Any of the above methods can be carried out using a tangible computer readable electronic storage medium storing instructions which, when executed on one or more programmed processors, carry out the method.
  • circuit functions are carried out using equivalent executed on one or more programmed processors.
  • General purpose computers, microprocessor based computers, micro-controllers, optical computers, analog computers, dedicated processors, application specific circuits and/or dedicated hard wired logic and analog circuitry may be used to construct alternative equivalent embodiments.
  • Other embodiments could be implemented using hardware component equivalents such as special purpose hardware and/or dedicated processors.
  • Certain embodiments may be implemented using a programmed processor executing programming instructions that in certain instances are broadly described above in flow chart form that can be stored on any suitable electronic or computer readable storage medium (such as, for example, disc storage, Read Only Memory (ROM) devices, Random Access Memory (RAM) devices, network memory devices, optical storage elements, magnetic storage elements, magneto-optical storage elements, flash memory, core memory and/or other equivalent volatile and non-volatile storage technologies) and/or can be transmitted over any suitable electronic communication medium.
  • ROM Read Only Memory
  • RAM Random Access Memory
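
The packet filter 58 described above separates audio packets by packet identifier and removes packet overhead. The following is a minimal sketch of that idea, assuming the transport stream uses the common MPEG-2 transport packet layout (188-byte packets, sync byte 0x47, 13-bit PID); the text itself does not name the packet format. The function name `extract_payloads` is hypothetical, and the sketch only strips the transport header and any adaptation field. Elementary stream headers and audio decoding are not shown.

```python
TS_PACKET_SIZE = 188   # assumed MPEG-2 transport packet size
SYNC_BYTE = 0x47       # assumed sync byte at the start of each packet


def extract_payloads(stream: bytes, audio_pid: int):
    """Yield the payload of each transport packet whose PID equals audio_pid."""
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue  # skip packets that are out of sync
        # The PID is the low 5 bits of byte 1 followed by all of byte 2.
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        if pid != audio_pid:
            continue  # not an audio packet: filtered out
        adaptation_control = (packet[3] >> 4) & 0x3
        if adaptation_control in (0, 2):
            continue  # reserved value, or adaptation field only (no payload)
        start = 4  # fixed transport header length
        if adaptation_control == 3:
            start += 1 + packet[4]  # skip the length byte and the adaptation field
        if start < TS_PACKET_SIZE:
            yield packet[start:]
```

Concatenating the yielded payloads gives the audio elementary data with transport overhead removed, which is the input a decoder or the six second buffer would consume.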

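The music check described for FIG. 6 can be approximated in software along the following lines. This is only a sketch under stated assumptions: the thresholds, the 300 to 3400 Hz speech band, and the use of energy shares to decide what counts as "flat" are illustrative choices, not values from the text. The filter for repetitive, noise-like frequencies is omitted, and the text suggests the real analysis would preferably run at the audio decoder level, possibly with hardware acceleration.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT power for bins 1..N/2-1 (adequate for short frames)."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def looks_like_music(samples, sample_rate, silence_rms=0.01,
                     monotone_share=0.9, speech_band=(300.0, 3400.0),
                     residual_share=0.05):
    """Return True if the frame passes the (assumed) song-attribute checks."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove any constant offset
    rms = math.sqrt(sum(s * s for s in centered) / n)
    if rms < silence_rms:
        return False  # nearly silent stream
    power = power_spectrum(centered)
    total = sum(power)
    if total == 0 or max(power) / total > monotone_share:
        return False  # one tone dominates: treated as a monotone
    # Discard the assumed speech band and see what energy remains on each side.
    low = high = 0.0
    for index, p in enumerate(power):
        freq = (index + 1) * sample_rate / n
        if freq < speech_band[0]:
            low += p
        elif freq > speech_band[1]:
            high += p
    # Distinct low and high content after filtering suggests music.
    return low / total > residual_share and high / total > residual_share
```

A frame that fails the check would simply not be sent; a frame that passes would be forwarded, or converted to a signature first, for identification.
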
Abstract

A television receiver device consistent with certain implementations has a display associated with the television receiver device. A filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data. A buffer stores a sample of the digital audio data. A modem transmits the sample of audio data from the buffer to a content identification server and receives metadata identifying the audio data from the content identification server. A display processor renders at least a portion of the metadata to the display. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract.
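
Rendering "at least a portion of the metadata" implies selecting some elements of the server response for display. The sketch below illustrates one way that selection might look. The JSON response shape, the field names, and both function names are hypothetical; the actual response format of a content identification service is not described here.

```python
import json

# Hypothetical set of predefined metadata elements to show in the overlay.
DISPLAY_FIELDS = ("title", "artist", "album")


def select_display_fields(response_text, fields=DISPLAY_FIELDS):
    """Keep only the predefined elements that are present and non-empty."""
    metadata = json.loads(response_text)
    return {name: metadata[name] for name in fields if metadata.get(name)}


def overlay_text(selected):
    """Format the selected elements as one line of overlay text."""
    return " | ".join(f"{name.title()}: {value}"
                      for name, value in selected.items())
```

The resulting string stands in for what a graphics overlay processor would draw over the running program.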

Description

COPYRIGHT AND TRADEMARK NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. Trademarks are the property of their respective owners.
BACKGROUND
Companies such as Gracenote, Inc. are in the content identification business. They have technology that can be used to provide metadata that identifies, for example, songs, artists, albums, etc. from samples of the songs that are sent to their servers. This process is used commercially by numerous manufacturers of hardware and software players such as compact disc (CD) players.
BRIEF DESCRIPTION OF THE DRAWINGS
Certain illustrative embodiments illustrating organization and method of operation, together with objects and advantages, may be best understood by reference to the detailed description that follows, taken in conjunction with the accompanying drawings in which:
FIG. 1 is an example system diagram of an example TV system consistent with certain embodiments of the present invention.
FIG. 2 depicts an example TV display consistent with certain embodiments of the present invention.
FIG. 3 is an example block diagram of a TV receiver device consistent with certain embodiments of the present invention.
FIG. 4 is a flow chart of an example process consistent with certain embodiments of the present invention.
FIG. 5 is another flow chart of another example process consistent with certain embodiments of the present invention.
FIG. 6 is another flow chart of an example process for determining if audio contains music attributes consistent with certain embodiments of the present invention.
DETAILED DESCRIPTION
While this invention is susceptible of embodiment in many different forms, there is shown in the drawings and will herein be described in detail specific embodiments, with the understanding that the present disclosure of such embodiments is to be considered as an example of the principles and not intended to limit the invention to the specific embodiments shown and described. In the description below, like reference numerals are used to describe the same, similar or corresponding parts in the several views of the drawings.
The terms “a” or “an”, as used herein, are defined as one or more than one. The term “plurality”, as used herein, is defined as two or more than two. The term “another”, as used herein, is defined as at least a second or more. The terms “including” and/or “having”, as used herein, are defined as comprising (i.e., open language). The term “coupled”, as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically. The term “program” or “computer program” or similar terms, as used herein, is defined as a sequence of instructions designed for execution on a computer system. A “program”, or “computer program”, may include a subroutine, a function, a procedure, an object method, an object implementation, in an executable application, an applet, a servlet, a source code, an object code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system. The term “processor”, “controller”, “CPU”, “Computer” and the like as used herein encompasses both hard programmed, special purpose, general purpose and programmable devices and may encompass a plurality of such devices or a single device in either a distributed or centralized configuration without limitation.
The term “program”, as used herein, may also be used in a second context (the above definition being for the first context). In the second context, the term is used in the sense of a “television program”. In this context, the term is used to mean any coherent sequence of audio video content such as those which would be interpreted as and reported in an electronic program guide (EPG) as a single television program, without regard for whether the content is a movie, sporting event, segment of a multi-part series, news broadcast, etc. The term may also be interpreted to encompass commercial spots and other program-like content which may not be reported as a program in an electronic program guide.
Reference throughout this document to “one embodiment”, “certain embodiments”, “an embodiment”, “an example”, “an implementation” or similar terms means that a particular feature, structure, or characteristic described in connection with the embodiment, example or implementation is included in at least one embodiment, example or implementation of the present invention. Thus, the appearances of such phrases in various places throughout this specification are not necessarily all referring to the same embodiment, example or implementation. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments, examples or implementations without limitation.
The term “or” as used herein is to be interpreted as an inclusive or meaning any one or any combination. Therefore, “A, B or C” means “any of the following: A; B; C; A and B; A and C; B and C; A, B and C”. An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.
As noted previously, companies such as Gracenote, Inc. are in the content identification business. They have technology that can be used to provide metadata that identifies, for example, songs, artists, albums, etc. from samples of the songs that are sent to their servers. This process is used commercially by numerous manufacturers of hardware and software players such as compact disc (CD) players. In general, Gracenote's algorithms for content (i.e., music) identification that run on their content identification servers require a sample of six seconds of PCM audio. Accordingly, this example will be used throughout, but should not be considered limiting since other algorithms may be devised that use a shorter or longer sample.
In order to use this technology with a television, to identify a song within a television program, one can capture six seconds of audio and send the audio to the content identification server. However, this results in a delay which may be undesirable to some users. This problem can be overcome in the TV device by continuously capturing the audio (or video data) and sending the data to a content identification server to acquire the metadata of content being played.
It will be appreciated that while the present example depicts capturing audio data for identification of a song, similar processes can be carried out by capture of video, audio/video (A/V) or other content without limitation.
In certain instances, when a song is playing during a TV program, the viewer may wish to find out information about the song. Unfortunately, using known technology, the user cannot currently find the music metadata unless it forms a part of the TV program so that the metadata is shown on the screen as part of the TV program. While the metadata could be inserted into the television program's video signal, this requires modification on both the receiver side and the transmission side.
In accord with certain implementations consistent with the present invention, while the TV plays audio or video, it also continuously captures the audio or video data and sends a sample of the data to a content identification server in order to acquire the metadata of the content being played. Since the metadata is being acquired from the server continuously (or on user demand), the device can show the metadata on the TV display.
Instead of sending the data to the server continuously, the television device can hold the data in a circular buffer so that, whenever the user initiates acquisition of the metadata, the most recent six seconds of valid audio can be sent. The TV device already has the music sample data and can send the sample data to the server immediately rather than having to first capture the data and then send the data to the content identification server and encounter undesirable delays. This speeds up the process by at least about six seconds according to present sample requirements.
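The circular buffer described above can be sketched in Python as follows. This is only a minimal illustration, not the implementation used in any actual receiver: the class name PcmRingBuffer, its methods, and the default audio parameters (48 kHz, two channels, 16-bit samples) are assumptions that do not appear in this disclosure. It shows how the most recent six seconds of decoded audio can be kept available at all times so that a sample can be sent without a capture delay.

```python
class PcmRingBuffer:
    """Fixed-capacity buffer that always holds the most recent PCM bytes.

    All names and default parameters here are illustrative assumptions.
    """

    def __init__(self, seconds=6, sample_rate=48000, channels=2, bytes_per_sample=2):
        # Capacity in bytes for the requested duration of audio.
        self.capacity = seconds * sample_rate * channels * bytes_per_sample
        self._buf = bytearray(self.capacity)
        self._write = 0   # position of the next byte to be written
        self._filled = 0  # number of valid bytes stored so far

    def write(self, chunk: bytes) -> None:
        # Only the last `capacity` bytes of an oversized chunk can survive.
        chunk = chunk[-self.capacity:]
        first = min(len(chunk), self.capacity - self._write)
        self._buf[self._write:self._write + first] = chunk[:first]
        rest = len(chunk) - first
        if rest:
            # Wrap around and overwrite the oldest data.
            self._buf[0:rest] = chunk[first:]
        self._write = (self._write + len(chunk)) % self.capacity
        self._filled = min(self.capacity, self._filled + len(chunk))

    def is_full(self) -> bool:
        return self._filled == self.capacity

    def snapshot(self) -> bytes:
        """Return the stored audio in oldest-to-newest order."""
        if not self.is_full():
            return bytes(self._buf[:self._filled])
        return bytes(self._buf[self._write:] + self._buf[:self._write])
```

In such a sketch, the decoder would call write() with each block of decoded audio, and snapshot() would be called only when a query is to be made, so that the user command does not have to wait for a fresh capture.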
Hence, in one example process:
1. The TV decodes and displays TV shows which are received over a digital channel.
2. While the TV decodes the digital video and audio, it captures decoded audio data, which may be pulse code modulated (PCM) data.
3. Once the TV captures six seconds of audio, a check can be performed to see if the six second sample is valid audio, and if so, the data is sent to a content identification server (in this case, a music identification server).
4. If the metadata received from the music identification server is valid, then the TV can show the metadata on the screen so that the music is identified to the user.
5. The song metadata is added to a database along with the current TV program data.
6. The user launches the song history, and is able to see all song history with related TV program data.
FIG. 1 is an example system diagram of an example TV system consistent with certain embodiments of the present invention. The system includes a TV monitor 10 that displays TV programs in a more or less conventional manner under control by the user generally using a remote controller 14. In order to obtain access to the functionality discussed herein, the TV monitor (or other connected receiver such as a TV set top box or other Internet enabled appliance) is connected via the Internet 18 to a Content Identification Server 22 such as those available commercially from content identification services such as Gracenote as previously discussed. The Internet connection can also be used by the TV receiver system to obtain electronic program guide (EPG) data from an online EPG server such as 26.
As described above, when the metadata for a particular song that is playing on the TV is identified, it can be displayed on the TV monitor 10 as depicted in FIG. 2. As shown, the TV program plays normally, and the metadata in the form of song title and artist is displayed as overlay graphics 30. Any other form of display of the metadata can be used including transparent windows and the like, and other metadata including album and further artist information or album graphics can be utilized. It will be understood that the illustration of FIG. 2 is merely intended as one illustration of the possibilities and not intended to be limiting.
FIG. 3 is an example partial block diagram of a TV receiver device 50 consistent with certain embodiments of the present invention, wherein only the functional blocks used to implement the present functionality are depicted. In this example, a receiver 54 or other data source or data interface (e.g., an Internet interface) receives digital A/V content from a source such as a cable network, satellite network, Telco, broadcaster or via the Internet and produces an output data transport stream carrying packetized audio, video and support data (e.g., PSIP data and the like). These packets are filtered (e.g., by packet identifier) and stripped of all packet overhead using a packet filter 58 in order to produce a qualified stream of compressed PCM audio data (e.g., using SDK compression or other standard specified by the content identification service) suitable for use by the content identification server. Such format is specified by the content identification service so as to standardize the data received for identification.
In accord with certain embodiments, this audio data can be continuously stored in a six second (or other suitable time) circular buffer 62 that permits the data to be immediately accessed for transmission to the content identification server 22 via modem 66 that is provided with a URI suitable for access to the server 22 via Internet 18. The content identification server 22 returns metadata if it can identify the song, and these metadata are parsed in a metadata parser 70, which may operate as a process running on the system control processor 74. The parsed metadata can then be passed to a graphics overlay processor 78 for rendering to display driver 82 and TV monitor display 86.
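By way of illustration only, filtering by packet identifier and removing packet overhead might look like the following Python sketch. It assumes, which this disclosure does not state, that the transport stream uses the MPEG-2 transport stream packet layout (188-byte packets, a 0x47 sync byte, a 13-bit packet identifier and an optional adaptation field). The function name extract_payloads is hypothetical, and any further unwrapping or decoding of the payload into PCM audio is not shown.

```python
TS_PACKET_SIZE = 188  # assumed MPEG-2 transport stream packet length
SYNC_BYTE = 0x47      # assumed sync byte at the start of each packet


def extract_payloads(ts_data: bytes, wanted_pid: int):
    """Yield the payload of every transport packet whose PID is wanted_pid."""
    for off in range(0, len(ts_data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts_data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue  # out of sync; a real demultiplexer would resynchronize
        # The 13-bit PID spans the low 5 bits of byte 1 and all of byte 2.
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        if pid != wanted_pid:
            continue  # packet belongs to video or support data
        afc = (pkt[3] >> 4) & 0x3  # adaptation field control bits
        if not afc & 0x1:
            continue  # packet carries no payload
        start = 4  # skip the fixed 4-byte header
        if afc & 0x2:
            start += 1 + pkt[4]  # skip the length byte and the adaptation field
        if start < TS_PACKET_SIZE:
            yield pkt[start:]
```

The payloads yielded for the audio packet identifier would then be passed on to whatever stage produces the format expected by the content identification service.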
The entire system operates under control of the system control processor 74 which has associated storage (RAM, ROM, etc.) 90, with control processor 74 interfacing with each functional block as required and receiving commands from remote controller 14 via remote control command receiver 94 in any suitable manner.
While this depiction shows storage of six seconds of audio, this is because a six second sample is used by Gracenote, Inc. for song recognition and is therefore shown in the example depicted above. But in general the sample should be long enough for recognition by any suitable content identification service. Further, the present concept can be extended to storage of a sample of video that is long enough for the video to be identified by content identification server 22, and for the system to then access EPG server 26 to obtain additional information regarding video information via video metadata. Other variations will occur to those skilled in the art upon consideration of the present teachings.
FIG. 4 is a flow chart of a first example process 100 consistent with certain embodiments of the present invention starting at 104 after which the system determines if the audio capture feature described is turned on at 108. If not, the remainder of the process can be defeated until the feature is turned on. If the capture feature is turned on at 108, the circular buffer begins capturing audio at 112 and continuously updates the audio so that at least six seconds (for the Gracenote implementation) of audio is always available. If the user determines that there is a song of interest playing as the audio at 116, then he or she may wish to identify the song.
Once the buffer has six seconds of music data stored, a request can be made to the server at 124 to identify the song. This can occur in one embodiment as a result of a user request at 120 (shown in dashed lines to indicate that it is optional), or as a matter of course wherein the server query can be carried out using any other mechanism (for example, a time criterion of every 90 seconds or so). The request at 124 can include conversion of the six seconds of audio into a smaller sized file representing a signature of the music sample that is used to identify the music.
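The decision of when to issue the request at 124 can be sketched as follows. This is a hedged illustration only: the class QueryScheduler, its method should_query, and the injectable clock are hypothetical names introduced here, and the 90 second default merely mirrors the example time criterion mentioned above.

```python
import time


class QueryScheduler:
    """Decides when a sample should be sent for identification (illustrative)."""

    def __init__(self, interval_s=90.0, clock=time.monotonic):
        self.interval_s = interval_s  # assumed periodic criterion, in seconds
        self._clock = clock           # injectable so the logic can be tested
        self._last_query = None       # time of the most recent query, if any

    def should_query(self, buffer_full: bool, user_requested: bool = False) -> bool:
        # No request is made until a full sample is available in the buffer.
        if not buffer_full:
            return False
        now = self._clock()
        due = self._last_query is None or now - self._last_query >= self.interval_s
        # Either a user request or the elapsed time criterion triggers a query.
        if user_requested or due:
            self._last_query = now
            return True
        return False
```

A user request resets the periodic timer in this sketch, which avoids sending a second, redundant query shortly after the user has already asked for identification.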
If the content identification server is unable to identify the content and sends a failure notice instead of metadata at 128, a failure message may be briefly presented as a screen overlay at 132. However, if metadata are received at 128, the song metadata may be stored to memory at 136. In certain implementations, the song metadata can be added to a database with or without current TV program data. In certain implementations, the song metadata can also be added to a song history list so that the user can launch a song history list command, and thereby retrieve from memory a list of songs identified with or without related TV program data. The storage as a database or list is depicted as 140 and this data are then accessible by user command either directly from a remote control or from a TV menu system such as a cross-media bar menu system.
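The song history storage depicted as 140 could, for example, be kept in a small relational table. The following Python sketch uses the standard sqlite3 module. The table name, column names and function names are assumptions made for illustration and are not taken from this disclosure.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the song history store; the schema is illustrative."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS song_history ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, artist TEXT, album TEXT, "
        "tv_program TEXT, identified_at TEXT)"
    )
    return db


def add_song(db, title, artist=None, album=None, tv_program=None, identified_at=None):
    # tv_program may be omitted, since metadata can be stored with or
    # without current TV program data.
    db.execute(
        "INSERT INTO song_history (title, artist, album, tv_program, identified_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, artist, album, tv_program, identified_at),
    )
    db.commit()


def song_history(db):
    """Return all identified songs, most recent first."""
    return db.execute(
        "SELECT title, artist, album, tv_program FROM song_history ORDER BY id DESC"
    ).fetchall()
```

A song history list command from the remote control or the TV menu system would then simply call song_history() and render the returned rows.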
If an option of display of the metadata is turned on at 144, the parsed metadata can be displayed to an overlay window or otherwise on the TV screen at 148 so that the user can identify relevant information about the song such as title, artist, author, album, etc. or even display an icon obtained from the content identification database representing album graphics. Other variations will occur to those skilled in the art upon consideration of the present teachings.
FIG. 5 shows one such variation with all operations being essentially the same as described in connection with FIG. 4. However, in this example, when a sample is identified by the system as having song attributes, the system requests the metadata without user prompting. Again, this could be carried out on a periodic basis or whenever a song is determined to be playing because the sample has song attributes. In this example, however, the metadata are already stored in memory and are not displayed until the user requests song info at 120. The information, in this case, can often be displayed immediately since it will have been previously requested and received. In a further variant, 120 can be omitted as indicated by the broken lines and any time a song is identified its metadata can be displayed.
The analysis of the audio to determine if it has song attributes is preferably done at the audio decoder level and possibly using hardware acceleration. This can be accomplished in a number of ways. One example is depicted in FIG. 6 starting at 200 where first the audio stream is tested to see if it is a flat frequency signal at 204, which may be a monotone or a nearly silent stream. If so, the audio is unlikely to be a song and/or is unsuitable to produce a signature and control returns to 204 for the next analysis. If the audio stream passes the test, the audio stream is passed through a set of filters at 208. One such filter targets similar and repetitive frequencies, since such frequencies often resemble noise in the TV program or commercial. Another such filter is a set of frequencies corresponding to normal conversational human speech, which represents a typical news reporting situation. If after such filtering at 208 the resultant audio stream is still not substantially flat in frequency response (i.e., it has distinct high and low frequencies), it is relatively certain that there is music in the audio stream at 212. The original audio stream (or its compressed signature) can then be sent to the content identification server at 216. Also, it is noted that when content identification is user-initiated, one can expect that the past 6 sec of audio stream has something worth identifying, and hence send it up to the content identification server without a music check. The server will merely return a negative result if the audio stream is not recognized as a piece of music.
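A very rough software sketch of the test of FIG. 6 is given below. It is only a heuristic illustration under stated assumptions: the function names, the thresholds, the 300 to 3400 Hz speech band and the use of a naive discrete Fourier transform are all choices made here and are not specified in this disclosure. The filter for similar and repetitive frequencies is omitted. The sketch rejects silent or single-tone audio (204), discards the assumed speech band (208), and reports music only if several distinct frequencies remain (212).

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Naive DFT returning (frequency_hz, magnitude) pairs below Nyquist."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append((k * sample_rate / n, abs(acc) / n))
    return spectrum


def looks_like_music(samples, sample_rate, silence_rms=0.01,
                     speech_band=(300.0, 3400.0), min_peaks=3, peak_ratio=0.2):
    """Heuristic music check; every threshold here is an assumed value."""
    # Step 204 (first part): a nearly silent stream is treated as flat.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms < silence_rms:
        return False
    spectrum = magnitude_spectrum(samples, sample_rate)
    strongest = max(m for _, m in spectrum)

    def peaks(bins):
        # Bins whose magnitude is a significant fraction of the strongest bin.
        return [f for f, m in bins if m >= peak_ratio * strongest]

    # Step 204 (second part): a monotone has a single dominant frequency.
    if len(peaks(spectrum)) <= 1:
        return False
    # Step 208: discard the band assumed to hold conversational speech.
    outside = [(f, m) for f, m in spectrum
               if not speech_band[0] <= f <= speech_band[1]]
    # Step 212: music is assumed if distinct frequencies remain after filtering.
    return len(peaks(outside)) >= min_peaks
```

A practical implementation would more likely run on short windows in the audio decoder, possibly with hardware acceleration as noted above, rather than with a naive transform in software.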
Also, in order to conserve memory space and have the option of storing more copies of past audio streams so that users can access beyond the past 6 seconds (e.g., access audio streams 30 sec earlier), the audio streams can be compressed into signature files which the content identification server recognizes and expects. The corresponding audio information (as to whether it is a song, and the time stamp) can be tagged to the signature files.
Thus, a television receiver device consistent with certain embodiments has a display associated with the television receiver device. A filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of PCM digital audio data, wherein the filter separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets. A circular buffer stores a sample of approximately six seconds in duration of the digital audio data. A modem transmits the sample of audio data from the buffer to a content identification server and receives metadata identifying the audio data from the content identification server. A storage device stores the metadata to a database or history list. A metadata parser selects predefined elements of metadata from the metadata for display on the display. A graphics overlay processor generates a graphics overlay containing the metadata. Either the sample of audio data is sent to the content identification server upon receipt of a user command from a remote control or the metadata is rendered to the display upon receipt of a user command from a remote control.
Another television receiver device has a display associated with the television receiver device. A filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data. A buffer stores a sample of the digital audio data. A modem transmits the sample of audio data from the buffer to a content identification server and receives metadata identifying the audio data from the content identification server. A display processor renders at least a portion of the metadata to the display.
In certain implementations, the digital audio data comprises Pulse Code Modulated (PCM) audio data. In certain implementations, the buffer comprises a circular buffer that continuously stores a defined quantity of audio data. In certain implementations, the defined quantity of audio data comprises approximately six seconds of audio. In certain implementations, the filter separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets. In certain implementations, a metadata parser selects predefined elements of metadata from the metadata for display on the display. In certain implementations, the display processor comprises a graphics overlay processor that generates a graphics overlay containing the metadata. In certain implementations, the sample of audio data is sent to the content identification server upon receipt of a user command from a remote control. In certain implementations, the metadata is rendered to the display upon receipt of a user command from a remote control. In certain implementations, the metadata are stored to a database or history list on a storage device.
A television receiver device consistent with certain implementations has a display associated with the television receiver device. A filter converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital content of a specified category. A buffer stores a sample of the digital content. A modem transmits the sample of digital content from the buffer to a content identification server and receives metadata identifying the digital content from the content identification server. A display processor renders at least a portion of the metadata to the display.
In certain implementations, the content comprises video content data. In certain implementations, the buffer comprises a circular buffer that continuously stores a defined quantity of the content. In certain implementations, the filter separates the specified category of content by use of packet identifiers to separate the specified category of content from other packets and produces the stream of digital content by removing packet overhead from packets of the specified category. In certain implementations, a metadata parser selects predefined elements of metadata from the metadata for display on the display. In certain implementations, the display processor comprises a graphics overlay processor that generates a graphics overlay containing the metadata. In certain implementations, the sample of content is sent to the content identification server upon receipt of a user command from a remote control. In certain implementations, the metadata is rendered to the display upon receipt of a user command from a remote control. In certain implementations, the metadata are stored to a database or history list on a storage device.
A method of rendering content metadata to a display associated with a television receiver device in a manner consistent with certain embodiments involves providing a display associated with the television receiver device; converting a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data; buffering a sample of the digital audio data in a buffer storage device; transmitting the sample of audio data from the buffer to a content identification server; receiving metadata identifying the audio data from the content identification server; and rendering at least a portion of the metadata to the display.
In certain implementations, the digital audio data comprises Pulse Code Modulated (PCM) audio data. In certain implementations, the buffering is carried out in a circular buffer that continuously stores a defined quantity of audio data. In certain implementations, the defined quantity of audio data comprises approximately six seconds of audio. In certain implementations, the filtering separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets. In certain implementations, the method further involves parsing the metadata to select predefined elements of metadata from the metadata for display on the display. In certain implementations, the predefined elements of metadata are rendered to the display using a graphics overlay processor that generates a graphics overlay containing the metadata. In certain implementations, the sample of audio data is sent to the content identification server upon receipt of a user command from a remote control. In certain implementations, the metadata is rendered to the display upon receipt of a user command from a remote control. In certain implementations, the method further involves storing the metadata to a database or history list on a storage device. In certain implementations, prior to transmitting a determination can be made as to whether the audio sample has song attributes.
Any of the above methods can be carried out using a tangible computer readable electronic storage medium storing instructions which, when executed on one or more programmed processors, carry out the method.
It is noted that while the above examples have focused on the identification of music within a television program, the same type of technology could be readily modified to sample and identify video or text or other types of content.
Those skilled in the art will recognize, upon consideration of the above teachings, that certain of the above exemplary embodiments are based upon use of a programmed processor. However, the invention is not limited to such exemplary embodiments, since other embodiments could be implemented using hardware component equivalents such as special purpose hardware and/or dedicated processors. Similarly, general purpose computers, microprocessor based computers, micro-controllers, optical computers, analog computers, dedicated processors, application specific circuits and/or dedicated hard wired logic may be used to construct alternative equivalent embodiments.
Certain embodiments described herein are or may be implemented using a programmed processor executing programming instructions that are broadly described above in flow chart form that can be stored on any suitable electronic or computer readable storage medium. However, those skilled in the art will appreciate, upon consideration of the present teaching, that the processes described above can be implemented in any number of variations and in many suitable programming languages without departing from embodiments of the present invention. For example, the order of certain operations carried out can often be varied, additional operations can be added or operations can be deleted without departing from certain embodiments of the invention. Error trapping can be added and/or enhanced and variations can be made in user interface and information presentation without departing from certain embodiments of the present invention. Such variations are contemplated and considered equivalent.
While certain embodiments herein were described in conjunction with specific circuitry that carries out the functions described, other embodiments are contemplated in which the circuit functions are carried out using equivalent software or firmware embodiments executed on one or more programmed processors. General purpose computers, microprocessor based computers, micro-controllers, optical computers, analog computers, dedicated processors, application specific circuits and/or dedicated hard wired logic and analog circuitry may be used to construct alternative equivalent embodiments. Other embodiments could be implemented using hardware component equivalents such as special purpose hardware and/or dedicated processors.
Certain embodiments may be implemented using a programmed processor executing programming instructions that in certain instances are broadly described above in flow chart form that can be stored on any suitable electronic or computer readable storage medium (such as, for example, disc storage, Read Only Memory (ROM) devices, Random Access Memory (RAM) devices, network memory devices, optical storage elements, magnetic storage elements, magneto-optical storage elements, flash memory, core memory and/or other equivalent volatile and non-volatile storage technologies) and/or can be transmitted over any suitable electronic communication medium. However, those skilled in the art will appreciate, upon consideration of the present teaching, that the processes described above can be implemented in any number of variations and in many suitable programming languages without departing from embodiments of the present invention. For example, the order of certain operations carried out can often be varied, additional operations can be added or operations can be deleted without departing from certain embodiments of the invention. Error trapping can be added and/or enhanced and variations can be made in user interface and information presentation without departing from certain embodiments of the present invention. Such variations are contemplated and considered equivalent.
While certain illustrative embodiments have been described, it is evident that many alternatives, modifications, permutations and variations will become apparent to those skilled in the art in light of the foregoing description.

Claims (26)

What is claimed is:
1. A television receiver device, comprising:
a television display forming a part of the television receiver device;
a filter forming a part of the television receiver device that converts a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of PCM digital audio data, where the filter separates the audio from audio video content by use of packet identifiers to separate audio packets from other packets and produces the stream of digital audio data by removing packet overhead from the audio packets;
a circular buffer forming a part of the television receiver device that stores a sample of a predetermined duration of the stream of PCM digital audio data;
a modem forming a part of the television receiver device that transmits the sample of PCM digital audio data from the buffer to a content identification server and that receives metadata identifying the audio data from the content identification server;
where the predetermined duration is at least as long as a time required by the content identification server to identify content;
a storage device forming a part of the television receiver device, where the metadata are stored to a database or history list on the storage device;
a metadata parser forming a part of the television receiver device that selects predefined elements of metadata from the metadata for display on the display;
a graphics overlay processor forming a part of the television receiver device that generates a graphics overlay containing the metadata; and
where the sample of PCM digital audio data is sent to the content identification server upon receipt of a first user command from a remote control and the metadata is rendered to the display upon receipt of a second user command from the remote control.
2. A television receiver device, comprising:
a display associated with the television receiver device;
a filter forming a part of the television receiver device that is configured to convert a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data, wherein the filter is configured to separate the audio from audio video content by use of packet identifiers to separate audio packets from other packets and is further configured to produce the stream of digital audio data by removing packet overhead from the audio packets;
a circular buffer forming a part of the television receiver device that is configured to store a sample of the digital audio data;
a modem forming a part of the television receiver device that is configured to transmit the sample of digital audio data from the buffer to a content identification server and that is further configured to receive metadata identifying the audio data from the content identification server;
a metadata parser forming a part of the television receiver device that is configured to select predefined elements of metadata from the metadata for display on the display; and
a display processor forming a part of the television receiver device that is configured to render at least a portion of the metadata to the display.
3. The television receiver device according to claim 2, where the digital audio data comprises Pulse Code Modulated (PCM) audio data.
4. The television receiver device according to claim 2, where the circular buffer is configured to continuously store a defined quantity of digital audio data that is at least as large as required by the content identification server.
5. The television receiver device according to claim 4, where the defined quantity of digital audio data comprises at least six seconds of audio.
6. The television receiver according to claim 2, where the display processor comprises a graphics overlay processor that is configured to generate a graphics overlay containing at least a portion of the metadata.
7. The television receiver according to claim 2, further comprising a control processor configured to cause the sample of digital audio data to be sent to the content identification server upon receipt of a user command at the television receiver device from a remote control.
8. The television receiver according to claim 2, where the display processor is configured to render the metadata to the display upon receipt of a user command from a remote control.
9. The television receiver according to claim 2, where the metadata are stored to a database or history list on a storage device coupled to the television receiver device.
10. A television receiver device, comprising:
a display associated with the television receiver device;
a filter forming a part of the television receiver device that is configured to convert a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital content of a specified category, wherein the filter is configured to separate the specified category of digital content by use of packet identifiers to separate the specified category of digital content from other packets and is further configured to produce the stream of digital content by removing packet overhead from packets of the specified category;
a circular buffer forming a part of the television receiver device that is configured to store a sample of the digital content;
a modem forming a part of the television receiver device that is configured to transmit the sample of digital content from the buffer to a content identification server and that is further configured to receive metadata identifying the digital content from the content identification server;
a metadata parser that is configured to select predefined elements of metadata from the metadata for display on the display; and
a display processor forming a part of the television receiver device that is configured to render at least a portion of the metadata to the display.
11. The television receiver device according to claim 10, where the digital content comprises video content data.
12. The television receiver device according to claim 10, where the circular buffer is configured to continuously store a defined quantity of the digital content.
13. The television receiver according to claim 10, where the display processor comprises a graphics overlay processor that is configured to generate a graphics overlay containing the metadata.
14. The television receiver according to claim 10, further comprising a control processor configured to cause the sample of digital content to be sent to the content identification server upon receipt of a user command from a remote control.
15. The television receiver according to claim 10, where the display processor is configured to render the metadata to the display upon receipt of a user command from a remote control.
16. The television receiver according to claim 10, where the metadata are stored to a database or history list on a storage device forming a part of the television receiver device.
17. A method of rendering content metadata to a display associated with a television receiver device, comprising:
providing a display associated with the television receiver device;
at the television receiver device, converting a stream of audio/video content that is to be displayed on the display associated with the television receiver device into a stream of digital audio data, wherein the filtering separates the digital audio data from audio video content by use of packet identifiers to separate digital audio data packets from other packets and produces the stream of digital audio data by removing packet overhead from the digital audio data packets;
at the television receiver device, buffering a sample of the digital audio data in a circular buffer storage device;
at the television receiver device, transmitting the sample of digital audio data from the circular buffer storage device to a content identification server;
at the television receiver device, receiving metadata identifying the digital audio data from the content identification server;
at the television receiver device, parsing the metadata to select predefined elements of metadata from the metadata for display on the display; and
at the television receiver device, rendering at least a portion of the metadata to the display.
18. The method according to claim 17, where the digital audio data comprises Pulse Code Modulated (PCM) audio data.
19. The method according to claim 17, where the circular buffer continuously stores a defined quantity of digital audio data.
20. The method according to claim 19, where the defined quantity of digital audio data comprises at least six seconds of audio.
21. The method according to claim 17, where the predefined elements of metadata are rendered to the display using a graphics overlay processor that generates a graphics overlay containing the metadata.
22. The method according to claim 17, where the sample of digital audio data is sent to the content identification server upon receipt of a user command from a remote control.
23. The method according to claim 17, where the metadata is rendered to the display upon receipt of a user command from a remote control.
24. The method according to claim 17, further comprising storing the metadata to a database or history list on a storage device forming part of the television receiver device.
25. The method according to claim 17, further comprising determining if the digital audio data has song attributes prior to the transmitting.
26. A non-transitory tangible computer readable electronic storage medium storing instructions which, when executed on one or more programmed processors, carry out a method according to claim 17.
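As an illustration only, the filtering, buffering, and parsing steps recited in claims 10 and 17 can be sketched as follows. This is a minimal sketch, not the patented implementation. It assumes the audio/video stream is an MPEG transport stream of 188-byte packets, treats only the 4-byte packet header as "packet overhead" (adaptation fields, PES headers, and audio decoding are ignored), and represents the metadata as a dictionary. The names `filter_by_pid`, `CircularBuffer`, and `select_metadata`, and the default element names `title` and `artist`, are hypothetical and do not appear in the patent.

```python
from collections import deque

TS_PACKET_SIZE = 188  # length of one MPEG transport stream packet, in bytes
TS_SYNC_BYTE = 0x47   # first byte of every transport stream packet
TS_HEADER_SIZE = 4    # assumption: only the fixed 4-byte header is stripped


def filter_by_pid(stream: bytes, wanted_pid: int) -> bytes:
    """Keep packets whose packet identifier (PID) matches, minus their headers."""
    out = bytearray()
    for off in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = stream[off:off + TS_PACKET_SIZE]
        if pkt[0] != TS_SYNC_BYTE:
            continue  # skip anything that is not aligned on a sync byte
        # The PID is 13 bits: low 5 bits of byte 1 followed by all of byte 2.
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        if pid == wanted_pid:
            out += pkt[TS_HEADER_SIZE:]
    return bytes(out)


class CircularBuffer:
    """Fixed-capacity buffer that always holds the most recent bytes written."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently discards the oldest items when full.
        self._buf = deque(maxlen=capacity)

    def write(self, data: bytes) -> None:
        self._buf.extend(data)

    def snapshot(self) -> bytes:
        """Return the current contents, oldest byte first."""
        return bytes(self._buf)


def select_metadata(metadata: dict, wanted=("title", "artist")) -> dict:
    """Pick the predefined elements to display (element names are hypothetical)."""
    return {key: metadata[key] for key in wanted if key in metadata}


# Capacity for six seconds of audio, assuming 48 kHz, 16-bit, stereo PCM.
SIX_SECONDS_PCM = 48_000 * 2 * 2 * 6
```

In this sketch a user command would trigger `snapshot()` on a buffer created with `CircularBuffer(SIX_SECONDS_PCM)`, and the returned bytes would be the sample sent to the content identification server. The sample rate, bit depth, and channel count behind `SIX_SECONDS_PCM` are assumptions; the claims specify only "at least six seconds of audio".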
US12/590,259 2009-11-05 2009-11-05 Automatic capture of data for acquisition of metadata Expired - Fee Related US8490131B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/590,259 US8490131B2 (en) 2009-11-05 2009-11-05 Automatic capture of data for acquisition of metadata

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12/590,259 US8490131B2 (en) 2009-11-05 2009-11-05 Automatic capture of data for acquisition of metadata

Publications (2)

Publication Number Publication Date
US20110102684A1 (en) 2011-05-05
US8490131B2 (en) 2013-07-16

Family

ID=43925070

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/590,259 Expired - Fee Related US8490131B2 (en) 2009-11-05 2009-11-05 Automatic capture of data for acquisition of metadata

Country Status (1)

Country Link
US (1) US8490131B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104254007A (en) * 2014-09-03 2014-12-31 海信集团有限公司 Method and device for processing audio
CN105139850A (en) * 2015-08-12 2015-12-09 西安诺瓦电子科技有限公司 Speech interaction device, speech interaction method and speech interaction type LED asynchronous control system terminal
EP3023892A1 (en) * 2014-11-18 2016-05-25 Samsung Electronics Co., Ltd. Content processing device and method for transmitting segment of variable size
US11159845B2 (en) 2014-12-01 2021-10-26 Sonos, Inc. Sound bar to provide information associated with a media item

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5274305B2 (en) * 2009-02-27 2013-08-28 キヤノン株式会社 Image processing apparatus, image processing method, and computer program
US8490131B2 (en) * 2009-11-05 2013-07-16 Sony Corporation Automatic capture of data for acquisition of metadata

Citations (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4739398A (en) * 1986-05-02 1988-04-19 Control Data Corporation Method, apparatus and system for recognizing broadcast segments
US5751672A (en) 1995-07-26 1998-05-12 Sony Corporation Compact disc changer utilizing disc database
US5987525A (en) 1997-04-15 1999-11-16 Cddb, Inc. Network delivery of interactive entertainment synchronized to playback of audio recordings
US6304523B1 (en) 1999-01-05 2001-10-16 Openglobe, Inc. Playback device having text display and communication with remote database of titles
US20020023267A1 (en) * 2000-05-31 2002-02-21 Hoang Khoi Nhu Universal digital broadcast system and methods
US6378035B1 (en) * 1999-04-06 2002-04-23 Microsoft Corporation Streaming information appliance with buffer read and write synchronization
US20020166128A1 (en) * 2000-07-28 2002-11-07 Tamotsu Ikeda Digital broadcasting system
US20030028796A1 (en) * 2001-07-31 2003-02-06 Gracenote, Inc. Multiple step identification of recordings
US20030086341A1 (en) * 2001-07-20 2003-05-08 Gracenote, Inc. Automatic identification of sound recordings
US20040119814A1 (en) * 2002-12-20 2004-06-24 Clisham Allister B. Video conferencing system and method
US20040143349A1 (en) * 2002-10-28 2004-07-22 Gracenote, Inc. Personal audio recording system
US20050086692A1 (en) * 2003-10-17 2005-04-21 Mydtv, Inc. Searching for programs and updating viewer preferences with reference to program segment characteristics
US20050273319A1 (en) * 2004-05-07 2005-12-08 Christian Dittmar Device and method for analyzing an information signal
US6983289B2 (en) 2000-12-05 2006-01-03 Digital Networks North America, Inc. Automatic identification of DVD title using internet technologies and fuzzy matching techniques
US7012183B2 (en) 2001-05-14 2006-03-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus for analyzing an audio signal with regard to rhythm information of the audio signal by using an autocorrelation function
US7167857B2 (en) 1997-04-15 2007-01-23 Gracenote, Inc. Method and system for finding approximate matches in database
US7228280B1 (en) 1997-04-15 2007-06-05 Gracenote, Inc. Finding database match for file based on file characteristics
US20070143777A1 (en) * 2004-02-19 2007-06-21 Landmark Digital Services Llc Method and apparatus for identificaton of broadcast source
US7282632B2 (en) 2004-09-28 2007-10-16 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung Ev Apparatus and method for changing a segmentation of an audio piece
US7304231B2 (en) 2004-09-28 2007-12-04 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung Ev Apparatus and method for designating various segment classes
US7308485B2 (en) 1997-04-15 2007-12-11 Gracenote, Inc. Method and system for accessing web pages based on playback of recordings
US20070288478A1 (en) * 2006-03-09 2007-12-13 Gracenote, Inc. Method and system for media navigation
US20080049704A1 (en) * 2006-08-25 2008-02-28 Skyclix, Inc. Phone-based broadcast audio identification
US20080187188A1 (en) * 2007-02-07 2008-08-07 Oleg Beletski Systems, apparatuses and methods for facilitating efficient recognition of delivered content
US7477739B2 (en) 2002-02-05 2009-01-13 Gracenote, Inc. Efficient storage of fingerprints
US20090041418A1 (en) 2007-08-08 2009-02-12 Brant Candelore System and Method for Audio Identification and Metadata Retrieval
US7549052B2 (en) 2001-02-12 2009-06-16 Gracenote, Inc. Generating and matching hashes of multimedia content
US20090217320A1 (en) * 2007-12-28 2009-08-27 Verizon Data Services Inc. Method and apparatus for providing displayable applications
US20110102684A1 (en) * 2009-11-05 2011-05-05 Nobukazu Sugiyama Automatic capture of data for acquisition of metadata

Patent Citations (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4739398A (en) * 1986-05-02 1988-04-19 Control Data Corporation Method, apparatus and system for recognizing broadcast segments
US5751672A (en) 1995-07-26 1998-05-12 Sony Corporation Compact disc changer utilizing disc database
US6061680A (en) 1997-04-15 2000-05-09 Cddb, Inc. Method and system for finding approximate matches in database
US7167857B2 (en) 1997-04-15 2007-01-23 Gracenote, Inc. Method and system for finding approximate matches in database
US6154773A (en) 1997-04-15 2000-11-28 Cddb, Inc. Network delivery of interactive entertainment complementing audio recordings
US6161132A (en) 1997-04-15 2000-12-12 Cddb, Inc. System for synchronizing playback of recordings and display by networked computer systems
US6230192B1 (en) 1997-04-15 2001-05-08 Cddb, Inc. Method and system for accessing remote data based on playback of recordings
US6230207B1 (en) 1997-04-15 2001-05-08 Cddb, Inc. Network delivery of interactive entertainment synchronized to playback of audio recordings
US6240459B1 (en) 1997-04-15 2001-05-29 Cddb, Inc. Network delivery of interactive entertainment synchronized to playback of audio recordings
US5987525A (en) 1997-04-15 1999-11-16 Cddb, Inc. Network delivery of interactive entertainment synchronized to playback of audio recordings
US6330593B1 (en) 1997-04-15 2001-12-11 Cddb Inc. System for collecting use data related to playback of recordings
US7308485B2 (en) 1997-04-15 2007-12-11 Gracenote, Inc. Method and system for accessing web pages based on playback of recordings
US7228280B1 (en) 1997-04-15 2007-06-05 Gracenote, Inc. Finding database match for file based on file characteristics
US6304523B1 (en) 1999-01-05 2001-10-16 Openglobe, Inc. Playback device having text display and communication with remote database of titles
US6378035B1 (en) * 1999-04-06 2002-04-23 Microsoft Corporation Streaming information appliance with buffer read and write synchronization
US20020023267A1 (en) * 2000-05-31 2002-02-21 Hoang Khoi Nhu Universal digital broadcast system and methods
US20020166128A1 (en) * 2000-07-28 2002-11-07 Tamotsu Ikeda Digital broadcasting system
US6983289B2 (en) 2000-12-05 2006-01-03 Digital Networks North America, Inc. Automatic identification of DVD title using internet technologies and fuzzy matching techniques
US7549052B2 (en) 2001-02-12 2009-06-16 Gracenote, Inc. Generating and matching hashes of multimedia content
US7012183B2 (en) 2001-05-14 2006-03-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus for analyzing an audio signal with regard to rhythm information of the audio signal by using an autocorrelation function
US20030086341A1 (en) * 2001-07-20 2003-05-08 Gracenote, Inc. Automatic identification of sound recordings
US7328153B2 (en) 2001-07-20 2008-02-05 Gracenote, Inc. Automatic identification of sound recordings
US20030028796A1 (en) * 2001-07-31 2003-02-06 Gracenote, Inc. Multiple step identification of recordings
US7477739B2 (en) 2002-02-05 2009-01-13 Gracenote, Inc. Efficient storage of fingerprints
US20040143349A1 (en) * 2002-10-28 2004-07-22 Gracenote, Inc. Personal audio recording system
US20040119814A1 (en) * 2002-12-20 2004-06-24 Clisham Allister B. Video conferencing system and method
US20050086692A1 (en) * 2003-10-17 2005-04-21 Mydtv, Inc. Searching for programs and updating viewer preferences with reference to program segment characteristics
US20070143777A1 (en) * 2004-02-19 2007-06-21 Landmark Digital Services Llc Method and apparatus for identificaton of broadcast source
US20050273319A1 (en) * 2004-05-07 2005-12-08 Christian Dittmar Device and method for analyzing an information signal
US7565213B2 (en) 2004-05-07 2009-07-21 Gracenote, Inc. Device and method for analyzing an information signal
US7304231B2 (en) 2004-09-28 2007-12-04 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung Ev Apparatus and method for designating various segment classes
US7345233B2 (en) 2004-09-28 2008-03-18 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung Ev Apparatus and method for grouping temporal segments of a piece of music
US7282632B2 (en) 2004-09-28 2007-10-16 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung Ev Apparatus and method for changing a segmentation of an audio piece
US20070288478A1 (en) * 2006-03-09 2007-12-13 Gracenote, Inc. Method and system for media navigation
US20080049704A1 (en) * 2006-08-25 2008-02-28 Skyclix, Inc. Phone-based broadcast audio identification
US20080187188A1 (en) * 2007-02-07 2008-08-07 Oleg Beletski Systems, apparatuses and methods for facilitating efficient recognition of delivered content
US20090041418A1 (en) 2007-08-08 2009-02-12 Brant Candelore System and Method for Audio Identification and Metadata Retrieval
US20090217320A1 (en) * 2007-12-28 2009-08-27 Verizon Data Services Inc. Method and apparatus for providing displayable applications
US20110102684A1 (en) * 2009-11-05 2011-05-05 Nobukazu Sugiyama Automatic capture of data for acquisition of metadata

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104254007A (en) * 2014-09-03 2014-12-31 海信集团有限公司 Method and device for processing audio
EP3023892A1 (en) * 2014-11-18 2016-05-25 Samsung Electronics Co., Ltd. Content processing device and method for transmitting segment of variable size
CN105611400A (en) * 2014-11-18 2016-05-25 三星电子株式会社 Content processing device and method for transmitting segment of variable size
KR20160059131A (en) * 2014-11-18 2016-05-26 삼성전자주식회사 Contents processing device and method for transmitting segments of variable size and computer-readable recording medium
US9910919B2 (en) 2014-11-18 2018-03-06 Samsung Electronics Co., Ltd. Content processing device and method for transmitting segment of variable size, and computer-readable recording medium
CN105611400B (en) * 2014-11-18 2020-11-06 三星电子株式会社 Content processing apparatus and method for transmitting variable-size segments
US11159845B2 (en) 2014-12-01 2021-10-26 Sonos, Inc. Sound bar to provide information associated with a media item
US11743533B2 (en) 2014-12-01 2023-08-29 Sonos, Inc. Sound bar to provide information associated with a media item
CN105139850A (en) * 2015-08-12 2015-12-09 西安诺瓦电子科技有限公司 Speech interaction device, speech interaction method and speech interaction type LED asynchronous control system terminal

Also Published As

Publication number Publication date
US20110102684A1 (en) 2011-05-05

Similar Documents

Publication Publication Date Title
RU2632403C2 (en) Terminal device, server device, method of information processing, program and system of related application delivery
US10194199B2 (en) Methods, systems, and computer program products for categorizing/rating content uploaded to a network for broadcasting
RU2601446C2 (en) Terminal apparatus, server apparatus, information processing method, program and interlocked application feed system
CA2924065C (en) Content based video content segmentation
EP2406732B1 (en) Bookmarking system
US8307403B2 (en) Triggerless interactive television
WO2017063399A1 (en) Video playback method and device
US20160073141A1 (en) Synchronizing secondary content to a multimedia presentation
US9009751B2 (en) Systems and methods for searching based on information in commercials
US20120315014A1 (en) Audio fingerprinting to bookmark a location within a video
US8490131B2 (en) Automatic capture of data for acquisition of metadata
KR101992475B1 (en) Using an audio stream to identify metadata associated with a currently playing television program
US20070199037A1 (en) Broadcast program content retrieving and distributing system
JP5135024B2 (en) Apparatus, method, and program for notifying content scene appearance
US9489421B2 (en) Transmission apparatus, information processing method, program, reception apparatus, and application-coordinated system
KR20070043372A (en) System for management of real-time filtered broadcasting videos in a home terminal and a method for the same
US8966533B2 (en) Receiving apparatus, information processing method, program, transmitting apparatus, and application interlocking system for acquiring and executing an application in conjunction with reproduction of content
US11551722B2 (en) Method and apparatus for interactive reassignment of character names in a video device
JP6112598B2 (en) Information acquisition apparatus, information acquisition method, and information acquisition program
JP2022183550A (en) Receiving device, client terminal device, and program
KR20120031671A (en) A method for automatically providing dictionary of foreign language for a display device
EP3044728A1 (en) Content based video content segmentation
KR20160036658A (en) Method, apparatus and system for covert advertising

Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SUGIYAMA, NOBUKAZU;CHEE, JAIME;DUNN, TED;AND OTHERS;SIGNING DATES FROM 20091102 TO 20091103;REEL/FRAME:023604/0184

Owner name: SONY ELECTRONICS INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SUGIYAMA, NOBUKAZU;CHEE, JAIME;DUNN, TED;AND OTHERS;SIGNING DATES FROM 20091102 TO 20091103;REEL/FRAME:023604/0184

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20170716