How Stanford Engineers Created a Fictitious Compression Algorithm For HBO
Tekla Perry (3034735) writes that Professor Tsachy Weissman and Ph.D. student Vinith Misra came up with (almost) believable compression algorithms for HBO's Silicon Valley. Some constraints -- they had to seem plausible, look good when illustrated on a whiteboard, and work with the punchline, "middle out." Next season the engineers may encourage producers to tackle the challenge of local decodability.
Stanford as a buzzword factory (Score:4, Funny)
Now they can admit it.
Re: (Score:2)
Re: The cast (Score:1)
I concur; the directors don't extract the essence, or haven't coached the actors to make it believable enough to be funny. Like an RF guru from Quebec I once worked with, who, instead of putting serial numbers on each aluminum casting he designed/tested for a CATV WAN-to-the-curb-with-T1 in the '80s, put the name of a different girlfriend from his life on each pole-mounted unit.
Another who knew how to tweak another 0.1 less loss in a microwave 3.4dB splitter and also how to transmit a burst that coul
Meh (Score:4, Insightful)
Anyone who knows anything about compression knows that universal lossless compression is impossible to always do, because if such an algorithm existed, you could run it repeatedly on a data source until you were down to a single bit. And uncompressing a single bit that could be literally anything is problematic.
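For anyone who wants to see the counting behind that: there are 2^n bit strings of length n but only 2^n - 1 strings that are strictly shorter, so no injective (lossless) scheme can shrink every input. A minimal sketch of the bookkeeping in Python (illustration only):

    # Pigeonhole bookkeeping: a lossless compressor must map distinct inputs to
    # distinct outputs, but for every length n there are more n-bit strings than
    # there are strictly shorter strings, so some n-bit input can't shrink.
    def n_bit_strings(n):
        return 2 ** n

    def shorter_strings(n):
        return sum(2 ** k for k in range(n))  # lengths 0 .. n-1, equals 2**n - 1

    for n in range(1, 12):
        assert shorter_strings(n) < n_bit_strings(n)
        print(f"n={n:2d}: {n_bit_strings(n)} inputs, only {shorter_strings(n)} shorter outputs")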
I sort of wish they'd picked some other sort of woo.
Re:Meh (Score:4)
I don't think they mean universal that way; I believe they mean universal lossless compression like gzip, bzip2, or 7zip. Those will work on almost any data, but not on all kinds of data. The idea here is that the show has a new way to do this that is supposed to be even better. The method they use reminds me of FLAC, though.
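To put rough numbers on "almost any data, but not all": zlib (the deflate engine behind gzip) is a reasonable stand-in. It squeezes redundant text, gains essentially nothing on random bytes, and its framing overhead can even make already-compressed input slightly bigger. A small standard-library demo:

    import os
    import zlib

    text = b"the quick brown fox jumps over the lazy dog " * 200
    random_bytes = os.urandom(8192)            # stand-in for incompressible data
    precompressed = zlib.compress(text, 9)     # compressing twice buys ~nothing

    for label, data in [("text", text),
                        ("random", random_bytes),
                        ("already compressed", precompressed)]:
        out = zlib.compress(data, 9)
        print(f"{label:>20}: {len(data):6d} -> {len(out):6d} bytes")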
Regarding compression (Score:1)
Some 20 years ago, when there were several compression programs to choose from, I remember running some tests and finding that a utility called WinAce was the only compression utility that produced a compressed file smaller than the original when compressing .avi, .mov, and .jpg files.
All the rest produced "compressed" files larger than the original.
Recompress the coefficients (Score:3)
JPEG is a lossy compression format, and it's impossible for an archiving utility using lossless compression to best that.
Of course it's possible. JPEG encoding has three steps: cosine transform of each block (DCT), then quantization (where the loss happens), then coding. In JPEG, the coding involves a zig-zag order and a Huffman/RLE structure, and this isn't necessarily optimal. A lossless compressor specially tuned for JPEG files could decode the quantized coefficients and losslessly encode them in a more efficient manner, producing a file that saves a few percent compared to the equivalent JPEG bitstream. Then on decompression, it would decode these coefficients and reencode them back into a JPEG file.
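A rough feel for where that slack comes from, using a toy 8x8 DCT-and-quantize stage instead of a real JPEG parser (the dct2 helper and the fake image are made up for the sketch; packJPG-style tools work on the actual bitstream, this only shows that quantized coefficients are far more redundant than a generic byte stream):

    import numpy as np
    import zlib

    # Toy version of JPEG's first two steps (DCT, then quantization), to show why
    # a JPEG-aware recompressor has room to beat the standard zig-zag + Huffman
    # coding: the quantized coefficients are mostly zeros and small integers.
    def dct2(block):
        """Orthonormal 2-D DCT-II of an 8x8 block via an explicit cosine matrix."""
        n = 8
        k = np.arange(n)
        c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
        c[0, :] = np.sqrt(1.0 / n)
        return c @ block @ c.T

    rng = np.random.default_rng(0)
    # Fake 64x64 "image": a smooth gradient plus mild noise.
    img = np.add.outer(np.arange(64.0), np.arange(64.0)) + rng.normal(0, 2, (64, 64))
    quant = 16  # flat quantization step, standing in for a JPEG quantization table

    blocks = [np.round(dct2(img[y:y + 8, x:x + 8] - 128) / quant).astype(np.int16)
              for y in range(0, 64, 8) for x in range(0, 64, 8)]
    coeffs = np.stack(blocks)

    raw = coeffs.tobytes()
    print("quantized coefficients:", len(raw), "bytes raw,",
          len(zlib.compress(raw, 9)), "after generic lossless coding")
    print("zero coefficients: {:.0%}".format(float(np.mean(coeffs == 0))))

Even a generic byte compressor finds a lot of that redundancy; the packJPG work linked elsewhere in the thread goes further by modeling the coefficient statistics directly.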
Mod parent up - applicable to gzip/deflate (Score:1)
Sometimes you don't even need to change the file format: optimization can be applied to already-compressed gzip/deflate streams (the format PNG uses) to create a more optimal deflate/gzip file. See tools like DeflOpt [encode.ru] and defluff [encode.ru] (DeflOpt can sometimes make even zopfli-encoded files smaller [google.com]).
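This works because deflate does not pin down a unique stream for a given input; encoders are free to pick different matches and Huffman tables, and every resulting stream inflates to identical bytes. The standard zlib module shows the same freedom at a coarser level (effort settings rather than the exhaustive search those tools do):

    import zlib

    data = b"".join(b"line %d: the quick brown fox jumps over the lazy dog\n" % i
                    for i in range(500))

    for level in (1, 6, 9):
        stream = zlib.compress(data, level)
        assert zlib.decompress(stream) == data  # every stream restores the same bytes
        print(f"level {level}: {len(stream)} bytes")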
JPEG recompression (Score:2)
ACT has a JPEG recompression test [compression.ca] which clearly shows a bunch of compressors making a JPEG smaller. Even better - there's a great paper by the author of packJPG talking about how to compress a JPEG losslessly [htw-aalen.de] using the technique teppples described...
Re: (Score:2)
Of course it's possible. JPEG encoding has three steps: cosine transform of each block (DCT), then quantization (where the loss happens), then coding. In JPEG, the coding involves a zig-zag order and a Huffman/RLE structure, and this isn't necessarily optimal. A lossless compressor specially tuned for JPEG files could decode the quantized coefficients and losslessly encode them in a more efficient manner, producing a file that saves a few percent compared to the equivalent JPEG bitstream. Then on decompression, it would decode these coefficients and reencode them back into a JPEG file.
I believe what they meant was that you would not be able to apply a lossless algorithm to the original data stream and achieve greater compression than applying a lossy algorithm. Your composite algorithm is just a more efficient lossy algorithm.
If we look at the original statement from an information theoretic point of view, the GP's statement should be easily understood. With a lossless algorithm, you have to encode all of the original information and restore it. Assuming an optimal encoding, it will
Re: (Score:2, Interesting)
I haven't seen the show, but I have experience in dinking around with lossless compression, and suffice it to say, the problem would be solved if time travel existed, because then we could compress data that doesn't yet exist.
Basically, to do lossless you have to compress the data linearly. You can't compress, on another core right now, the chunk of data the compressor will reach in 10 seconds, because the precursor symbols do not yet exist. Now there is one way around this, but it's even more horribly inefficient, and that is by
Re: (Score:2)
The method they use reminds me of FLAC, though.
FLAC is actually in the first episode for a few seconds; it was the baseline they were comparing against.
Re: (Score:2)
Exactly. The first step in ANY compression algorithm is:
Know Thy Data
Your mention of FLAC is a perfect example.
Re:Meh (Score:5, Funny)
"you could run it repeatedly on a data source until you were down to a single bit."
That's why you need two distinct compression algorithms. Sometimes one will work better, sometimes the other. While repeatedly compressing, don't forget to write down in which sequence you need to apply the decompression. I believe this can compress arbitrary data down to zero bits, if you are patient enough.
Re: (Score:2, Funny)
You do the same thing you did the first time: two algorithms, write down the order. ;)
Re: (Score:2)
I'm not sure you understand. Prepending the order to the compressed data would still increase the length of some files.
(In before whoosh.)
Re: (Score:2)
Re:Meh (Score:4, Funny)
Metadata? You just let the NSA store it for you.
Re:Meh (Score:4, Funny)
While repeatedly compressing, don't forget to write down in which sequence you need to apply the decompression.
Pretty much. I've found that I can do this. Essentially, for N bits, I've got a large family (2^N) of compression algorithms. I pick the best one and write down its number. The resulting data is 0 bits long, but there's a little metadata to store.
Re: (Score:3)
Meh. You only need the basic rules of physics to compute the universe from scratch, including all possible movies.
Re: (Score:2)
are probabilistic
I'm sorry, but that can't be right. If we relied on probability, even in an infinite universe we'd never see the likes of Mariah Carey's "Glitter".
Re: (Score:2)
Physics is a man-made endeavor; the "laws of physics" are inventions of man. Reality may work another way, not according to any model that man's mind could devise.
Re: Meh (Score:3)
You mean 1 (you forgot the parity bit).
Re: (Score:3)
> if such an algorithm existed, you could run it repeatedly on a data source until you were down to a single bit.
Ah, but you are not describing universal lossless compression but universal lossless compression with a guaranteed compression ratio of better than 1:1.
That indeed isn't possible but I can't see it claimed in TFA.
Re: (Score:2)
It may not have been claimed in the article, but it was claimed on the show itself.
Re:Meh (Score:4, Interesting)
Re:Meh (Score:5, Insightful)
Though technically true, in fairness we need to differentiate between meaningful data and noise. Yes, a universal compressor doesn't care. Human users of compression algorithms, for the most part, do care.
So the limit of useful compression (Shannon aside) comes down to how well we can model the data. As a simple example, I can give you two 64 bit floats as parameters to a quadratic iterator, and you can fill your latest 6TB HDD with conventionally "incompressible" data as the output. If, however, you know the right model, you can recreate that data with a mere 16 bytes of input. Now extend that to more complex functions - Our entire understanding of "random" means nothing more than "more complex than we know how to model". As another example, the delay between decays in a sample of radioactive material - We currently consider that "random", but someday may discover that god doesn't play dice with the universe, and an entirely deterministic process underlies every blip on the ol' Geiger counter.
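To make the two-floats-plus-a-model point concrete, here is a hedged sketch using a logistic map as the quadratic iterator (the quadratic_stream helper and the chosen parameters are just for illustration): a general-purpose compressor gets almost nothing out of the generated bytes, yet the whole stream is reproducible from 16 bytes of parameters plus the few lines of model.

    import struct
    import zlib

    def quadratic_stream(r, x0, n_bytes):
        """Iterate x -> r*x*(1-x) and emit a low-order byte of each value."""
        x = x0
        out = bytearray()
        for _ in range(n_bytes):
            x = r * x * (1.0 - x)
            out.append(int(x * 2**32) & 0xFF)  # low bits of the iterate look like noise
        return bytes(out)

    params = struct.pack("<dd", 3.99, 0.37)   # the whole "archive": 16 bytes
    data = quadratic_stream(*struct.unpack("<dd", params), 1 << 20)

    print("parameters:", len(params), "bytes")
    print("generated: ", len(data), "bytes")
    print("zlib -9:   ", len(zlib.compress(data, 9)), "bytes")

    # Anyone holding the same 16 bytes (and the model) regenerates the data exactly.
    assert data == quadratic_stream(*struct.unpack("<dd", params), 1 << 20)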
So while I agree with you technically, for the purposes of a TV show? Lighten up.
Re: (Score:2)
Or if you're into math, you invoke the pigeonhole principle.

So the limit of useful compression (Shannon aside) comes down to how well we can model the data. [...]
IOW, Kolmogorov complexity [wikipedia.org]. For example, tracker and MIDI files are a great way to "compress" music, as they contain the actual notation/composition rather than the resulting sound. Of course, that doesn't account for all the redundancy in instruments/samples.
So while I agree with you technically, for the purposes of a TV show? Lighten up. :)
IMHO, half the fun of such TV shows is exactly in discussions like this -- what it got right, where it went wrong, how could we use the ideas in some real-world innovation. I find that deeper understanding only makes me enjoy things more, not less, an
Re: (Score:2)
Re: (Score:3)
16 bytes plus the model.
Re: (Score:2)
Re: (Score:2)
Anyone who knows anything about compression knows that universal lossless compression is impossible to always do, because if such an algorithm existed, you could run it repeatedly on a data source until you were down to a single bit. And uncompresing a single bit that could be literally anything is problematic.
Actually, anyone who knows anything about compression knows that universal lossless compression is always possible to do. Sometimes you get good compression, sometimes you get shitty compression, and sometimes you get 0% compression. Note that 0% compression is still compression — it's just a degenerate case.
You are right, of course, that you can't repeatedly compress something down to a single bit — there is always a point of diminishing returns. But just because you can't compress something do
Shannon's source coding theorem (Score:3)
Re: (Score:2)
1
Apparently the /. crapfilter doesn't like my compressed comment.
Re: (Score:2)
Troll? Joke? Fundamental mis-understanding regarding the nature of information?
Insufficient information to tell.
Re: (Score:3, Interesting)
Professional Engineer here (Score:1)
I am a PE in California..
One can do engineering in California without a license under the "industrial exemption," and even be called an engineer on one's business card.
What you can't do is hang up a shingle and run your own business as Joe Bloggs, Engineer, unless you have a license.
A lot of companies have an HR policy that to be an "engineer" requires a 4-year degree; otherwise you are a "technician". To a certain extent, this is an "exempt" vs. "non-exempt" (overtime) distinction. Engineers are exempt from overtime because they are "professional" (having conducted a course of advanced study), Technicians are not.
Re: (Score:2)
What you can't do is hang up a shingle and run your own business as Joe Bloggs, Engineer, unless you have a license.
True, but I've seen non-licensed people who call themselves consulting engineers instead of consultants. Many of these people use "engineer", but whaddaya gonna do, place them under citizen's arrest? However, civil engineers are very strict on licensing, unlike the vast number of Silicon Valley engineers.
Engineers are exempt from overtime because they are "professional" (having conducted a course of advanced study), Technicians are not.
Reminds me of a Dilbert cartoon where he is working a lot of unpaid overtime, while the hardhat maintenance technician either gets to go home at the end of the day or gets 1.5 or 2 times normal wage.
Re: (Score:2)
I don't think that's true. Fully 50% of Silicon Valley job postings are for "XXX Engineer" and most of those are programming positions.
You want believability? (Score:1)
Re: (Score:2)
Re: (Score:2)
Future troubles (Score:2)
Now let's just hope that no aliens listening in on our broadcasts, and no far-future humans, actually believe this and try to recreate the "groundbreaking compression algorithm" the whiz humans of the digital age came up with.
Re: (Score:2)
Aren't you thinking of encryption?
Pilot signal and FEC (Score:2)
He just described how MPEG works (sort of) (Score:1)
He came up with the idea of using lossy compression techniques to compress the original file, then calculating the difference between the approximate reconstruction and the original file and compressing that data; by combining the two pieces, you have a lossless compressor.
This type of approach can work, Misra said, but, in most cases, would not be more efficient than a standard lossless compression algorithm, because coding the error usually isn’t any easier than coding the data.
Well, this is almost how MPEG movie compression works, and it really does work! MPEG works by partly describing the next picture from the previous one using motion vectors. These vectors describe how the next picture will look based on movements of small-ish macroblocks of the original picture. Now, if that were the only element of the algorithm, movies would look kind of strange (like paper-doll characters being moved about)! So the secret to making it work is to send extra information allowing the client to calculate the real picture: the compressed difference (residual) between the motion-predicted frame and the actual one.
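For anyone curious how the lossy-plus-residual split described above fits together mechanically, here is a toy sketch: coarse quantization of a 1-D signal stands in for the lossy stage and zlib stands in for the entropy coder. It has nothing to do with the show's fictional algorithm, and, in line with Misra's caveat, the split is not necessarily smaller than straight lossless; it just shows that lossy approximation plus coded residual reconstructs the original exactly.

    import zlib
    import numpy as np

    rng = np.random.default_rng(1)
    # Toy "signal": a smooth trend plus small noise, stored as 16-bit integers.
    signal = (1000 * np.sin(np.linspace(0, 20, 50_000))
              + rng.normal(0, 3, 50_000)).astype(np.int16)

    # Lossy stage: keep only a coarse approximation (quantize to steps of 64).
    approx = (signal // 64) * 64
    # Residual stage: exactly what the lossy stage threw away (values 0..63).
    residual = (signal - approx).astype(np.int16)

    lossy_part = zlib.compress(approx.tobytes(), 9)
    residual_part = zlib.compress(residual.tobytes(), 9)

    # Combining the two pieces restores the original bit for bit.
    restored = (np.frombuffer(zlib.decompress(lossy_part), dtype=np.int16)
                + np.frombuffer(zlib.decompress(residual_part), dtype=np.int16))
    assert np.array_equal(restored, signal)

    print("original:                  ", signal.nbytes, "bytes")
    print("lossy part + residual:     ", len(lossy_part) + len(residual_part), "bytes")
    print("plain zlib on the original:", len(zlib.compress(signal.tobytes(), 9)), "bytes")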
Re: (Score:1)