How video codec works

Advertisements

How to measure video quality perception

Update 2(01/06/2016): Fixed reference video bitrate unit from Kbps to KBps

Update 1(10/16/2016): Anne Aaron presented the VMAF at the Demuxed 2016.

When working with videos, you should be focusing all your efforts on best quality of streaming, less bandwidth usage, and low latency in order to deliver the best experience for the users.

This is not an easy task. You often need to test different bitrates, encoder parameters, fine tune your CDN and even try new codecs. You usually run a process of testing a combination of configurations and codecs and check the final renditions with your naked eyes. This process doesn’t scale, can’t we just trust computers to check that?

bit rate (bitrate): is a measure often used in digital video, usually it is assumed the rate of bits per seconds, it is one of the many terms used in video streaming.

screen-shot-2016-10-08-at-9-30-26-am
same resolution, different bitrates.

codec: is an electronic circuit or software that compresses or decompresses digital content. (ex: H264 (AVC), VP9, AAC (HE-AAC), AV1 and etc)

We were about to start a new hack day session here at Globo.com and since some of us learned how to measure the noise introduced when encoding and compressing images, we thought we could play with the stuff we learned by applying the methods to measure video quality.

We started by using the PSNR (peak signal-to-noise ratio) algorithm which can be defined in terms of the mean squared error (MSE) in decibel scale.

PSNR: is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise.

First, you calculate the MSE which is the average of the squares of the errors and then you normalize it to decibels.

For 3D signals (colored image), your MSE needs to sum all the means for each plane (ie: RGB, YUV and etc) and then divide by 3 (or 3 * MAX ^ 2).

To validate our idea, we downloaded videos (720p, h264) with the bitrate of 3400 kbps from distinct groups like News, Soap Opera and Sports. We called this group of videos the pivots or reference videos. After that, we generated some transrated versions of them with lower bitrates. We created 700 kbps, 900 kbps, 1300 kbps, 1900 kbps and 2800 kbps renditions for each reference video.

Heads Up! Typically the pivot video (most commonly referred to as reference video), uses a truly lossless compression, the bitrate for a YUV420p raw video should be 1280x720x1.5(given the YUV420 format)x24fps /1000 = 33177.6KBps, far more than what we used as reference (3400KBps).

We extracted 25 images for each video and calculate the PSNR comparing the pivot image with the modified ones. Finally, we calculate the mean. Just to help you understand the numbers below, a higher PSNR means that the image is more similar to the pivot.

700 kbps 900 kbps 1300 kbps 1900 kbps 2800 kbps 3400 kbps
Soap Op. 35.0124 36.5159 38.6041 40.3441 41.9447
News 28.6414 30.0076 32.6577 35.1601 37.0301
Sports 32.5675 34.5158 37.2104 39.4079 41.4540
screen-shot-2016-10-08-at-9-15-24-am
A visual sample.

We defined a PSNR of 38 (from our observations) as the ideal but then we noticed that the News group didn’t meet the goal. When we plotted the News data in the graph we could see what happened.

The issue with the video from the News group is that they’re a combination of different sources: External traffic camera with poor resolution, talking heads in a studio camera with good resolution and quality, some scenes with computer graphics (like the weather report) and others. We suspected that the News average was affected by those outliers but this kind of video is part of our reality.

kitbcrnx2uuu4
The different video sources are visible in clusters. (PSNR(frames))

We needed a better way to measure the quality perception so we searched for alternatives and we reached one of the Netflix’s posts: an approach toward a practical perceptual video quality metric (VMAF). At first, we learned that PSNR does not consistently reflect human perception and that Netflix is creating ways to approach this with the VMAF model.

They created a dataset with several videos including videos that are not part of the Netflix library and put real people to grade it. They called this score of DMOS. Now they could compare how each algorithm scores against DMOS.

netflix
FastSSIM, PSNRHVS, PSNR and SSIM (y) vs DMOS (x)

They realized that none of them were perfect even though they have some strength in certain situations. They adopted a machine-learning based model to design a metric that seeks to reflect human perception of video quality (a Support Vector Machine (SVM) regressor).

The Netflix approach is much wider than using PSNR alone. They take into account more features like motion, different resolutions and screens and they even allow you train the model with your own video dataset.

“We developed Video Multimethod Assessment Fusion, or VMAF, that predicts subjective quality by combining multiple elementary quality metrics. The basic rationale is that each elementary metric may have its own strengths and weaknesses with respect to the source content characteristics, type of artifacts, and degree of distortion. By ‘fusing’ elementary metrics into a final metric using a machine-learning algorithm – in our case, a Support Vector Machine (SVM) regressor”

Netflix about VMAF

The best news (pun intended) is that the VMAF is FOSS by Netflix and you can use it now. The following commands can be executed in the terminal. Basically, with Docker installed, it installs the VMAF, downloads a video, transcodes it (using docker image of FFmpeg) to generate a comparable video and finally checks the VMAF score.

You saved around 1.89 MB (37%) and still got the VMAF score 94.

Using a composed solution like VMAF or VQM-VFD proved to be better than using a single metric, there are still issues to be solved but I think it’s reasonable to use such algorithms plus A/B tests given the impractical scenario of hiring people to check video impairments.

A/B tests: For instance, you could use X% of your user base for Y days offering them the newest changes and see how much they would reject it.

Functional Programing 101 :: WWH

function-machine

WWH: What? Why? How?

  1. What: a quick (hopefully, useful to real world) guide to functional programing using JavaScript strongly based on most adequate book.
  2. Why: it might empower you to write more robust programs: reusable, shorter, easier to reason about, less prone to error among others.
  3. How: by providing a quick textual introduction (WWH) followed by a simple code example and when possible a real code example.

Intro :: concepts

Functional Programing

What: a way to build code in which you use functions as the main design tool.

Why: might lead to code that’s easier to test, debug, parallelize, and understand.

How: thinking about what programs should do instead of how, using functions as the major unit to solve problems on computer.

First Class Functions

What: “functions are like any other data type and there is nothing particularly special about them – they may be stored in arrays, passed around, assigned to variables.”

Why: use functions to compose programs in a style that you can easily reason about, maintain, reuse and grow.

How: just create and use functions to solve problems.

Pure Functions

What: “a function that, given the same input, will always return the same output and does not have any observable side effect.”

Why: with pure functions we can easily cache, debug, test and parallelize the processing of them. There is no state to understand / set up.

How: write functions that does not have side effect. Although we’ll eventually write programs that mutate values, we can certainly try to minimize it. (And when we do need to mutate values, we can use functions to help us)

Basic toolbox :: currying

What: “You can call a function with fewer arguments than it expects. It returns a function that takes the remaining arguments.”

Why: you can promote the reusability to function level, you can use them to compose programs that expects another function

How: build a function with n parameters that returns n functions instead of the immediate result.

Medium toolbox :: composing

What: is the act of creating your programs using lots of functions.

Why: this promotes the reuse at a great level and forces you to think about what instead of how.

How: chain functions to produce a new callable function.

Example :: motivational

What: a better example to motivate you to go further with functional programing.

Why: most near real world examples are great to motivate you to learn something.

How: since you can see all the concepts together, I think you’ll notice the value.

You can see the example running at https://jsfiddle.net/swmrmgur/2/ and check the commented code down bellow.

Screen Shot 2016-04-27 at 2.17.19 PM

Advanced toolbox & conclusion

I hope you might see the benefits you can have from using one or other technique from functional programming but for sure there are other benefits not shown here, I strongly recommend you to read the INCREDIBLE free book (gitbook) “Professor Frisby’s Mostly Adequate Guide to Functional Programming”, in fact, most of the ideas and examples here are from it.

There are advanced techniques to deal with data mutation with less pain, to handle errors and exceptions without try and catch and more abstractions that can help you and you can read them on the book.

And don’t use the handcrafted curry and compose built here (they’re far from production-ready), instead use a library like Ramda, which provides many basic functions like: map, filter and other all of them already curried, or lodash-fp.

Yeah, there no monado here. A special thank to Daniel Martins and Juarez Bochi, they helped a lot.

cover

3 tips to make you a better developer

four20puzzle

introduction

I’m sorry for the clickbait headline, I didn’t have a better idea/name for it.

We (developers) occasionally produce lazy/messy code and from time to time we need to remember the most important rule: “We do code to solve problems but also for human being be able: to use, to maintain and to evolute”. 

TLDR; (a unit can be a: function, var, method, class, parameter and etc)

  1. Naming your units with care and meaning;
  2. Try to see your code as a series of transformation;
  3. When possible make yours units generic.

Keep in mind that these tips are just my opinions and at the best they were based on: excellent books (Refactoring, DDD, Clean Coder and etc ), articles & blog posts,  excellent people I’ve worked/paired with,  presentations,  tweets and experiences.

naming is hard

Name your units with care and meaning. Your code should be easy to understand.

Although naming things is really hard, it is also extremely important. Let’s a see a snippet of code:

Let’s discuss about this code above:

  • the function topComments receives an id but is it the id from the comment, user, article? Let’s say it’s form the user, therefore userId should vanish this doubt.
  • the name of the function is topComments but it looks like it’s getting the top 10 latest comments only thus we could call it top10LatestCommentsFrom.
  • the ajax function accept two callbacks one in case of success (succCB) and otherwise an error (errCB), I believe we can call them: onSuccess and onError for better understanding.
  • all the arguments are using short names and we can have less confusing names just by using the entire name.
  • you got the ideia, naming things to let the code clear!

Although we still have so many problems in this code, now it’s easier to understand and we only named things properly.

For sure there are some cases when short names are just okay, for example: when you’re developing an emulator or virtual machine you often use short names like sp (stack pointer) and pc (program counter) or even doing a very generic unit.

filter -> union -> compact -> kick

Try to see and fit your code as transformations, one after another.

Some say that in computer science almost all the problems can be reduced to only two major problems: sort and count things (plus doing these in a distributed environment), anyway the point is: we usually are coding to make transformation over data.

For instance our function top10LatestCommentsFrom could be summarized in these steps:

  1. fetch comments (all)
  2. sort them (by date)
  3. filter them (only top)
  4. select the first 10

Which are just transformations over an initial list, we can make our function top10LatestCommentsFrom much better with that mindset.

 

By the way this could lead you to easily understand the new kid on the block sometimes referred as Functional Reactive Programming.

<be generic>

Work to make your units generic.

Let’s imagine you are in an interview process and your first task is to code a function which prints the numbers 1, 2 and 3 concatenated with “Hello, “. It should print: “Hello, 1” and then “Hello, 2″…

Now they ask you to print also the letters: “D”, “K” and “C”.

It was the first step toward the “generic”, now the interviewers say you have also to print a list of person’s name but now it’ll be a list of objects [{name: “person”},…].

Things start to get specific again and the interviewers want to test you. They ask you to print a list of car’s brand [{brand: “Ferrari”}, ..] plus a list of game consoles with their architecture [{name: “PS4”, arch: “x86-64”}, …]

Yikes, I suppose you’re not proud of that code and probably your interviewers will be little concerned about your skills with development, let’s list some of the problems with this approach.

  • Naming (we’re calling a person of an item)
  • High coupling (the function print knows too much about each printable)
  • Lots of (inner) conditionals 😦 it’s really hard to read/maintain/evolute this code

What we can do?! Well, it seems that all we need to do is to iterate through an array and prints an item but each item will require a different way of printing.

 

I said naming is important but when you make something very generic you should also make the abstract names not tied to any concrete concept. In fact, in Haskell (let’s pretend I know Haskell) when a concrete type of something may vary we use single letters to take their place.

Bonus round

  1. Make your units of execution to perform a single task.
  2. Use dispatch/pattern matching/protocol something instead of conditionals.
  3. Enforce DRY as much as you can.

Ruby overview

Introduction

Take a look at this, there is two classes: a person and a teacher. The person (originally) just know how to speak English and then teacher teach him speak other language, it can show you how powerful and beautiful ruby is.

class Person
 attr_accessor :name
 def speak_english
  puts "Hi people!"
 end
end

class BrPortugueseTeacher
 def teach(person)
  def person.speak_portuguese
   puts "Ola pessoal!"
  end
 end
end

bill_gates = Person.new
bill_gates.nome = "Bill Gates"
pasquale = Teacher.new
pasquale.teach bill_gates
#now bill gates knows portuguese
bill_gates.speak_portuguese

The intent here is just show a quick overview of ruby from a newbie.

Code’s comment

#one line comment
=begin
Multiply lines comment.
Given that ...
=end

String

String in ruby is mutable (but when you use the operator method + it creates another string, so to concatenate strings you should use the operator <<) and just a little tip avoid the concatenation by using + instead prefer use interpolation, a way to handle string very similar to expression language, and it’s faster than normal concatenation.

ran = 34434
who = "Leandro Moreira"
puts "#{who} generates this #{ran} number"

Conventions

Yet on mutability, when you write a method that can affect the internal state, you should use the bang operator (!) on method’s name.

old_source_name = "angeline"
puts old_source_name.capitalize
puts old_source_name.capitalize!

Another cool convention to Boolean methods is end them with ?

if account.cancelled?
 puts "Run Forest, run!"
end

Range object

In ruby you can use a type Range to describe ranges and its use is very easy.

zero_to_ten = (0..10) #inclusive
one_to_seven = (0...8) #exclusive
alphabetic = ('a'..'z') #you also can omit the (

It’s all object and quick tips

– Hey, language prints I win three times.

puts "I win " * 3

You can use anything on if statement and it can ben true or false (and nil which is false too).

A weird thing is one way of handle the regular expressions.

/myexp/ =~ "sentence"
#"sentence" matches myexp?

Another weird operator is or equals.

list ||= flights
#the list will just receive the flights if list is nil.

The classes are really open

One of the main features of ruby is Open Class, this is cool, you just can grab a class and write a new feature for it.

class String
 def do_nothing
  puts "doing nothig"
 end
end

And then you just call it.

"number".do_nothing

Let’s trick the addition operations on number.

class Fixnum
 def +(other)
  self - other
 end
end
puts 2+1
#and it will prints 1. (~:

Variable arguments

Sometimes you need to use this kind of flexibility.

user.buy computer
user.buy computer, mouse, monitor

To achieve this you just use the syntax. The splat operator how it is known.

def buy(*products)
 #buy logic
end

Hash enhancements

There is a lot of people which claims to use hash as parameter.

e_account.transfer :to_account => my, :value => 4800

def transfer (parameters)
 dest_account = parameters[:to_account]
 #...
end

Declarations

class Anything
 @field #object field
 @@field #class field
end

Singleton in Ruby

Singleton pattern is a design pattern used to implement the mathematical concept of a singleton, by restricting the instantiation of a class to one object. This is useful when exactly one object is needed to coordinate actions across the system. The concept is sometimes generalized to systems that operate more efficiently when only one object exists, or that restrict the instantiation to a certain number of objects (say, five). Some consider it an anti-pattern, judging that it is overused, introduces unnecessary limitations in situations where a sole instance of a class is not actually required, and introduces global state into an application.  (From wikipedia)

class HyperDao
@@instance = HyperDao.new
def self.instance
return @@instance
end
private_class_method :new
end

But we’re talking about ruby, don’t we?

require 'singleton'
class God
include Singleton
end

It’s done! 😀

Equals

If you want or need to rewrite the equals…

def ==(other)
 self.id = other.id
end

Duck typing – good-bye interface

Duck typing is a style of dynamic typing in which an object’s current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class or implementation of a specific interface. The name of the concept refers to the duck test, attributed to James Whitcomb Riley (see History below), which may be phrased as follows:”When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.” (From Wikipedia)

class PremiumAccount
 def saldo
   #
 end
end

class CommonAccount
 def saldo
  #
 end
end

The bank manager will accept that.

class BankManager
 def total_debt(accounts)
  for account in accounts do
   debt += account.saldo
  end
 end
end

Mixin

Mixin is a class that provides a certain functionality to be inherited or just reused by a subclass, while not meant for instantiation (the generation of objects of that class). Inheriting from a mixin is not a form of specialization but is rather a means of collecting functionality. A class may inherit most or all of its functionality from one or more mixins through multiple inheritance. (Again, from Wikipedia)

module Logging
 def log(message)
  puts message
 end
end

class Anything
 include Logging
  #...
end

any = Anything.new
any.log "It started now!"

Metaprogramming

Metaprogramming is the writing of computer programs that write or manipulate other programs (or themselves) as their data, or that do part of the work at compile time that would otherwise be done at runtime. In many cases, this allows programmers to get more done in the same amount of time as they would take to write all the code manually, or it gives programs greater flexibility to efficiently handle new situations without recompilation.

class Person
 attr_accessor :name
 def speak_english
  puts "Hi people!"
 end
end

class BrPortugueseTeacher
 def teach(person)
  def person.speak_portuguese
   puts "Ola pessoal!"
  end
 end
end

bill_gates = Person.new
bill_gates.nome = "Bill Gates"
pasquale = Teacher.new
pasquale.teach bill_gates
#now bill gates knows portuguese
bill_gates.speak_portuguese

Highly influenced by ruby on rails from Caelum.