release 2017.01.24

[ChangeLog] Actualize
[pluralsight] Fix extraction (closes #11820)
2017-01-24 02:58:37 +07:00 · 2017-01-24 02:56:19 +07:00 · 2017-01-24 02:51:45 +07:00 · 2017-01-23 23:38:31 +08:00 · 2017-01-23 23:37:32 +08:00 · 2017-01-23 03:50:39 +07:00
24 changed files with 505 additions and 66 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.18*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.18**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.24**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2017.01.18
+[debug] youtube-dl version 2017.01.24
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,28 @@
+version 2017.01.24
+
+Extractors
+* [pluralsight] Fix extraction (#11820)
++ [nextmedia] Add support for NextTV (壹電視)
+* [24video] Fix extraction (#11811)
+* [youtube:playlist] Fix nonexistent and private playlist detection (#11604)
++ [chirbit] Extract uploader (#11809)
+
+
+version 2017.01.22
+
+Extractors
++ [pornflip] Add support for pornflip.com (#11556, #11795)
+* [chaturbate] Fix extraction (#11797, #11802)
++ [azmedien] Add support for AZ Medien sites (#11784, #11785)
++ [nextmedia] Support redirected URLs
++ [vimeo:channel] Extract videos' titles for playlist entries (#11796)
++ [youtube] Extract episode metadata (#9695, #11774)
++ [cspan] Support Ustream embedded videos (#11547)
++ [1tv] Add support for HLS videos (#11786)
+* [uol] Fix extraction (#11770)
+* [mtv] Relax triforce feed regular expression (#11766)
+
+
 version 2017.01.18

 Extractors
--- a/README.md
+++ b/README.md
@@ -374,7 +374,7 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
                                     avprobe)
    --audio-format FORMAT            Specify audio format: "best", "aac",
                                     "vorbis", "mp3", "m4a", "opus", or "wav";
-                                     "best" by default
+                                     "best" by default; No effect without -x
    --audio-quality QUALITY          Specify ffmpeg/avconv audio quality, insert
                                     a value between 0 (better) and 9 (worse)
                                     for VBR or a specific bitrate like 128K
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -74,6 +74,8 @@
 - **awaan:live**
 - **awaan:season**
 - **awaan:video**
+ - **AZMedien**: AZ Medien videos
+ - **AZMedienShow**: AZ Medien shows
 - **Azubu**
 - **AzubuLive**
 - **BaiduVideo**: 百度视频
@@ -483,6 +485,7 @@
 - **Newstube**
 - **NextMedia**: 蘋果日報
 - **NextMediaActionNews**: 蘋果日報 - 動新聞
+ - **NextTV**: 壹電視
 - **nfb**: National Film Board of Canada
 - **nfl.com**
 - **NhkVod**
@@ -572,6 +575,7 @@
 - **PolskieRadio**
 - **PolskieRadioCategory**
 - **PornCom**
+ - **PornFlip**
 - **PornHd**
 - **PornHub**: PornHub and Thumbzilla
 - **PornHubPlaylist**
--- a/youtube_dl/extractor/azmedien.py
+++ b/youtube_dl/extractor/azmedien.py
@@ -0,0 +1,145 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from .kaltura import KalturaIE
+from ..utils import (
+    get_element_by_class,
+    strip_or_none,
+)
+
+
+class AZMedienBaseIE(InfoExtractor):
+    def _kaltura_video(self, partner_id, entry_id):
+        return self.url_result(
+            'kaltura:%s:%s' % (partner_id, entry_id), ie=KalturaIE.ie_key(),
+            video_id=entry_id)
+
+
+class AZMedienIE(AZMedienBaseIE):
+    IE_DESC = 'AZ Medien videos'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:www\.)?
+                        (?:
+                            telezueri\.ch|
+                            telebaern\.tv|
+                            telem1\.ch
+                        )/
+                        [0-9]+-show-[^/\#]+
+                        (?:
+                            /[0-9]+-episode-[^/\#]+
+                            (?:
+                                /[0-9]+-segment-(?:[^/\#]+\#)?|
+                                \#
+                            )|
+                            \#
+                        )
+                        (?P<id>[^\#]+)
+                    '''
+
+    _TESTS = [{
+        # URL with 'segment'
+        'url': 'http://www.telezueri.ch/62-show-zuerinews/13772-episode-sonntag-18-dezember-2016/32419-segment-massenabweisungen-beim-hiltl-club-wegen-pelzboom',
+        'info_dict': {
+            'id': '1_2444peh4',
+            'ext': 'mov',
+            'title': 'Massenabweisungen beim Hiltl Club wegen Pelzboom',
+            'description': 'md5:9ea9dd1b159ad65b36ddcf7f0d7c76a8',
+            'uploader_id': 'TeleZ?ri',
+            'upload_date': '20161218',
+            'timestamp': 1482084490,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # URL with 'segment' and fragment:
+        'url': 'http://www.telebaern.tv/118-show-news/14240-episode-dienstag-17-januar-2017/33666-segment-achtung-gefahr#zu-wenig-pflegerinnen-und-pfleger',
+        'only_matching': True
+    }, {
+        # URL with 'episode' and fragment:
+        'url': 'http://www.telem1.ch/47-show-sonntalk/13986-episode-soldaten-fuer-grenzschutz-energiestrategie-obama-bilanz#soldaten-fuer-grenzschutz-energiestrategie-obama-bilanz',
+        'only_matching': True
+    }, {
+        # URL with 'show' and fragment:
+        'url': 'http://www.telezueri.ch/66-show-sonntalk#burka-plakate-trump-putin-china-besuch',
+        'only_matching': True
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        partner_id = self._search_regex(
+            r'<script[^>]+src=["\'](?:https?:)?//(?:[^/]+\.)?kaltura\.com(?:/[^/]+)*/(?:p|partner_id)/([0-9]+)',
+            webpage, 'kaltura partner id')
+        entry_id = self._html_search_regex(
+            r'<a[^>]+data-id=(["\'])(?P<id>(?:(?!\1).)+)\1[^>]+data-slug=["\']%s'
+            % re.escape(video_id), webpage, 'kaltura entry id', group='id')
+
+        return self._kaltura_video(partner_id, entry_id)
+
+
+class AZMedienShowIE(AZMedienBaseIE):
+    IE_DESC = 'AZ Medien shows'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:www\.)?
+                        (?:
+                            telezueri\.ch|
+                            telebaern\.tv|
+                            telem1\.ch
+                        )/
+                        (?P<id>[0-9]+-show-[^/\#]+
+                            (?:
+                                /[0-9]+-episode-[^/\#]+
+                            )?
+                        )$
+                    '''
+
+    _TESTS = [{
+        # URL with 'episode'
+        'url': 'http://www.telebaern.tv/118-show-news/13735-episode-donnerstag-15-dezember-2016',
+        'info_dict': {
+            'id': '118-show-news/13735-episode-donnerstag-15-dezember-2016',
+            'title': 'News - Donnerstag, 15. Dezember 2016',
+        },
+        'playlist_count': 9,
+    }, {
+        # URL with 'show' only
+        'url': 'http://www.telezueri.ch/86-show-talktaeglich',
+        'only_matching': True
+    }]
+
+    def _real_extract(self, url):
+        show_id = self._match_id(url)
+        webpage = self._download_webpage(url, show_id)
+
+        entries = []
+
+        partner_id = self._search_regex(
+            r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
+            webpage, 'kaltura partner id', default=None)
+
+        if partner_id:
+            entries = [
+                self._kaltura_video(partner_id, m.group('id'))
+                for m in re.finditer(
+                    r'data-id=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage)]
+
+        if not entries:
+            entries = [
+                self.url_result(m.group('url'), ie=AZMedienIE.ie_key())
+                for m in re.finditer(
+                    r'<a[^>]+data-real=(["\'])(?P<url>http.+?)\1', webpage)]
+
+        title = self._search_regex(
+            r'episodeShareTitle\s*=\s*(["\'])(?P<title>(?:(?!\1).)+)\1',
+            webpage, 'title',
+            default=strip_or_none(get_element_by_class(
+                'title-block-cell', webpage)), group='title')
+
+        return self.playlist_result(entries, show_id, title)
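Review note, not part of the patch: a minimal standalone sketch of how the `_VALID_URL` pattern of `AZMedienIE` above captures the slug that `_real_extract` later matches against `data-slug`. The pattern and URLs are copied from this diff; the helper name `slug_of` is hypothetical, and only the standard-library `re` module is assumed.

```python
import re

# Pattern copied from AZMedienIE._VALID_URL in the diff above.
VALID_URL = r'''(?x)
    https?://
        (?:www\.)?
        (?:
            telezueri\.ch|
            telebaern\.tv|
            telem1\.ch
        )/
        [0-9]+-show-[^/\#]+
        (?:
            /[0-9]+-episode-[^/\#]+
            (?:
                /[0-9]+-segment-(?:[^/\#]+\#)?|
                \#
            )|
            \#
        )
        (?P<id>[^\#]+)
    '''


def slug_of(url):
    """Hypothetical helper: return the captured slug, or None if no match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


# 'segment' URL without a fragment: the slug is the segment name.
print(slug_of(
    'http://www.telezueri.ch/62-show-zuerinews/'
    '13772-episode-sonntag-18-dezember-2016/'
    '32419-segment-massenabweisungen-beim-hiltl-club-wegen-pelzboom'))

# 'show' URL with a fragment: the slug is the fragment.
print(slug_of(
    'http://www.telezueri.ch/66-show-sonntalk'
    '#burka-plakate-trump-putin-china-besuch'))
```

A show-only URL such as `http://www.telezueri.ch/86-show-talktaeglich` does not match this pattern, so it is left to `AZMedienShowIE`.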
--- a/youtube_dl/extractor/chaturbate.py
+++ b/youtube_dl/extractor/chaturbate.py
@@ -1,5 +1,7 @@
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..utils import ExtractorError

@@ -31,30 +33,35 @@ class ChaturbateIE(InfoExtractor):

        webpage = self._download_webpage(url, video_id)

-        m3u8_url = self._search_regex(
-            r'src=(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage,
-            'playlist', default=None, group='url')
+        m3u8_formats = [(m.group('id').lower(), m.group('url')) for m in re.finditer(
+            r'hlsSource(?P<id>.+?)\s*=\s*(?P<q>["\'])(?P<url>http.+?)(?P=q)', webpage)]

-        if not m3u8_url:
+        if not m3u8_formats:
            error = self._search_regex(
                [r'<span[^>]+class=(["\'])desc_span\1[^>]*>(?P<error>[^<]+)</span>',
                 r'<div[^>]+id=(["\'])defchat\1[^>]*>\s*<p><strong>(?P<error>[^<]+)<'],
                webpage, 'error', group='error', default=None)
            if not error:
-                if any(p not in webpage for p in (
+                if any(p in webpage for p in (
                        self._ROOM_OFFLINE, 'offline_tipping', 'tip_offline')):
                    error = self._ROOM_OFFLINE
            if error:
                raise ExtractorError(error, expected=True)
            raise ExtractorError('Unable to find stream URL')

-        formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4')
+        formats = []
+        for m3u8_id, m3u8_url in m3u8_formats:
+            formats.extend(self._extract_m3u8_formats(
+                m3u8_url, video_id, ext='mp4',
+                # ffmpeg skips segments for fast m3u8
+                preference=-10 if m3u8_id == 'fast' else None,
+                m3u8_id=m3u8_id, fatal=False, live=True))
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': self._live_title(video_id),
-            'thumbnail': 'https://cdn-s.highwebmedia.com/uHK3McUtGCG3SMFcd4ZJsRv8/roomimage/%s.jpg' % video_id,
+            'thumbnail': 'https://roomimg.stream.highwebmedia.com/ri/%s.jpg' % video_id,
            'age_limit': self._rta_search(webpage),
            'is_live': True,
            'formats': formats,
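Review note, not part of the patch: a small sketch of the new `hlsSource...` scan, run against a made-up page fragment. The regular expression and the `-10` preference for the `fast` variant are taken from the hunk above; `SAMPLE_PAGE`, `find_hls_sources` and `preference_for` are hypothetical names, and real pages may look different.

```python
import re

# Hypothetical page fragment, for illustration only.
SAMPLE_PAGE = '''
var hlsSourceFast = 'https://example.invalid/live-fast/playlist.m3u8';
var hlsSourceSlow = "https://example.invalid/live-slow/playlist.m3u8";
'''

# Regular expression copied from the diff above.
HLS_SOURCE_RE = (
    r'hlsSource(?P<id>.+?)\s*=\s*(?P<q>["\'])(?P<url>http.+?)(?P=q)')


def find_hls_sources(webpage):
    """Return (lower-cased variant id, playlist URL) pairs in page order."""
    return [(m.group('id').lower(), m.group('url'))
            for m in re.finditer(HLS_SOURCE_RE, webpage)]


def preference_for(m3u8_id):
    """Mirror the diff: rank the 'fast' playlist below the others."""
    return -10 if m3u8_id == 'fast' else None


for m3u8_id, m3u8_url in find_hls_sources(SAMPLE_PAGE):
    print(m3u8_id, preference_for(m3u8_id), m3u8_url)
```

An empty result corresponds to the `if not m3u8_formats:` branch, where the extractor looks for an error message or an offline marker instead.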
--- a/youtube_dl/extractor/chirbit.py
+++ b/youtube_dl/extractor/chirbit.py
@@ -19,6 +19,7 @@ class ChirbitIE(InfoExtractor):
            'title': 'md5:f542ea253f5255240be4da375c6a5d7e',
            'description': 'md5:f24a4e22a71763e32da5fed59e47c770',
            'duration': 306,
+            'uploader': 'Gerryaudio',
        },
        'params': {
            'skip_download': True,
@@ -54,6 +55,9 @@ class ChirbitIE(InfoExtractor):
        duration = parse_duration(self._search_regex(
            r'class=["\']c-length["\'][^>]*>([^<]+)',
            webpage, 'duration', fatal=False))
+        uploader = self._search_regex(
+            r'id=["\']chirbit-username["\'][^>]*>([^<]+)',
+            webpage, 'uploader', fatal=False)

        return {
            'id': audio_id,
@@ -61,6 +65,7 @@ class ChirbitIE(InfoExtractor):
            'title': title,
            'description': description,
            'duration': duration,
+            'uploader': uploader,
        }


--- a/youtube_dl/extractor/cspan.py
+++ b/youtube_dl/extractor/cspan.py
@@ -12,6 +12,7 @@ from ..utils import (
    ExtractorError,
 )
 from .senateisvp import SenateISVPIE
+from .ustream import UstreamIE


 class CSpanIE(InfoExtractor):
@@ -22,14 +23,13 @@ class CSpanIE(InfoExtractor):
        'md5': '94b29a4f131ff03d23471dd6f60b6a1d',
        'info_dict': {
            'id': '315139',
-            'ext': 'mp4',
            'title': 'Attorney General Eric Holder on Voting Rights Act Decision',
-            'description': 'Attorney General Eric Holder speaks to reporters following the Supreme Court decision in [Shelby County v. Holder], in which the court ruled that the preclearance provisions of the Voting Rights Act could not be enforced.',
        },
+        'playlist_mincount': 2,
        'skip': 'Regularly fails on travis, for unknown reasons',
    }, {
        'url': 'http://www.c-span.org/video/?c4486943/cspan-international-health-care-models',
-        'md5': '8e5fbfabe6ad0f89f3012a7943c1287b',
+        # md5 is unstable
        'info_dict': {
            'id': 'c4486943',
            'ext': 'mp4',
@@ -38,14 +38,11 @@ class CSpanIE(InfoExtractor):
        }
    }, {
        'url': 'http://www.c-span.org/video/?318608-1/gm-ignition-switch-recall',
-        'md5': '2ae5051559169baadba13fc35345ae74',
        'info_dict': {
            'id': '342759',
-            'ext': 'mp4',
            'title': 'General Motors Ignition Switch Recall',
-            'duration': 14848,
-            'description': 'md5:118081aedd24bf1d3b68b3803344e7f3'
        },
+        'playlist_mincount': 6,
    }, {
        # Video from senate.gov
        'url': 'http://www.c-span.org/video/?104517-1/immigration-reforms-needed-protect-skilled-american-workers',
@@ -57,12 +54,30 @@ class CSpanIE(InfoExtractor):
        'params': {
            'skip_download': True,  # m3u8 downloads
        }
+    }, {
+        # Ustream embedded video
+        'url': 'https://www.c-span.org/video/?114917-1/armed-services',
+        'info_dict': {
+            'id': '58428542',
+            'ext': 'flv',
+            'title': 'USHR07 Armed Services Committee',
+            'description': 'hsas00-2118-20150204-1000et-07\n\n\nUSHR07 Armed Services Committee',
+            'timestamp': 1423060374,
+            'upload_date': '20150204',
+            'uploader': 'HouseCommittee',
+            'uploader_id': '12987475',
+        },
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        video_type = None
        webpage = self._download_webpage(url, video_id)
+
+        ustream_url = UstreamIE._extract_url(webpage)
+        if ustream_url:
+            return self.url_result(ustream_url, UstreamIE.ie_key())
+
        # We first look for clipid, because clipprog always appears before
        patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')]
        results = list(filter(None, (re.search(p, webpage) for p in patterns)))
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -77,6 +77,10 @@ from .awaan import (
    AWAANLiveIE,
    AWAANSeasonIE,
 )
+from .azmedien import (
+    AZMedienIE,
+    AZMedienShowIE,
+)
 from .azubu import AzubuIE, AzubuLiveIE
 from .baidu import BaiduVideoIE
 from .bambuser import BambuserIE, BambuserChannelIE
@@ -594,6 +598,7 @@ from .nextmedia import (
    NextMediaIE,
    NextMediaActionNewsIE,
    AppleDailyIE,
+    NextTVIE,
 )
 from .nfb import NFBIE
 from .nfl import NFLIE
@@ -720,6 +725,7 @@ from .polskieradio import (
 )
 from .porn91 import Porn91IE
 from .porncom import PornComIE
+from .pornflip import PornFlipIE
 from .pornhd import PornHdIE
 from .pornhub import (
    PornHubIE,
--- a/youtube_dl/extractor/firsttv.py
+++ b/youtube_dl/extractor/firsttv.py
@@ -86,18 +86,43 @@ class FirstTVIE(InfoExtractor):
            title = item['title']
            quality = qualities(QUALITIES)
            formats = []
+            path = None
            for f in item.get('mbr', []):
                src = f.get('src')
                if not src or not isinstance(src, compat_str):
                    continue
                tbr = int_or_none(self._search_regex(
                    r'_(\d{3,})\.mp4', src, 'tbr', default=None))
+                if not path:
+                    path = self._search_regex(
+                        r'//[^/]+/(.+?)_\d+\.mp4', src,
+                        'm3u8 path', default=None)
                formats.append({
                    'url': src,
                    'format_id': f.get('name'),
                    'tbr': tbr,
-                    'quality': quality(f.get('name')),
+                    'source_preference': quality(f.get('name')),
                })
+            # m3u8 URL format is reverse engineered from [1] (search for
+            # master.m3u8). dashEdges (that is currently balancer-vod.1tv.ru)
+            # is taken from [2].
+            # 1. http://static.1tv.ru/player/eump1tv-current/eump-1tv.all.min.js?rnd=9097422834:formatted
+            # 2. http://static.1tv.ru/player/eump1tv-config/config-main.js?rnd=9097422834
+            if not path and len(formats) == 1:
+                path = self._search_regex(
+                    r'//[^/]+/(.+?$)', formats[0]['url'],
+                    'm3u8 path', default=None)
+            if path:
+                if len(formats) == 1:
+                    m3u8_path = ','
+                else:
+                    tbrs = [compat_str(t) for t in sorted(f['tbr'] for f in formats)]
+                    m3u8_path = '_,%s,%s' % (','.join(tbrs), '.mp4')
+                formats.extend(self._extract_m3u8_formats(
+                    'http://balancer-vod.1tv.ru/%s%s.urlset/master.m3u8'
+                    % (path, m3u8_path),
+                    display_id, 'mp4',
+                    entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
            self._sort_formats(formats)

            thumbnail = item.get('poster') or self._og_search_thumbnail(webpage)
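Review note, not part of the patch: a standalone sketch of how the master playlist URL is assembled from the progressive MP4 sources. The regular expressions, the `balancer-vod.1tv.ru` host and the `.urlset/master.m3u8` suffix come from the hunk above; the function name and the sample URLs are hypothetical. One deliberate difference from the diff: sources without a bitrate suffix are skipped when collecting bitrates, so that sorting cannot hit `None`.

```python
import re

# Host taken from the diff above; everything else here is illustrative.
BALANCER = 'http://balancer-vod.1tv.ru'


def build_master_m3u8_url(srcs):
    """Build the HLS master playlist URL from progressive MP4 URLs."""
    path = None
    tbrs = []
    for src in srcs:
        m = re.search(r'_(\d{3,})\.mp4', src)
        if m:  # assumption: ignore sources without a bitrate suffix
            tbrs.append(int(m.group(1)))
        if not path:
            m = re.search(r'//[^/]+/(.+?)_\d+\.mp4', src)
            if m:
                path = m.group(1)
    # Single source without a bitrate suffix: use its whole path.
    if not path and len(srcs) == 1:
        m = re.search(r'//[^/]+/(.+?$)', srcs[0])
        if m:
            path = m.group(1)
    if not path:
        return None
    if len(srcs) == 1:
        m3u8_path = ','
    else:
        m3u8_path = '_,%s,%s' % (
            ','.join(str(t) for t in sorted(tbrs)), '.mp4')
    return '%s/%s%s.urlset/master.m3u8' % (BALANCER, path, m3u8_path)


print(build_master_m3u8_url([
    'http://v1.example.invalid/video/2015/abc_2500.mp4',
    'http://v1.example.invalid/video/2015/abc_1000.mp4',
    'http://v1.example.invalid/video/2015/abc_600.mp4',
]))
```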
--- a/youtube_dl/extractor/flipagram.py
+++ b/youtube_dl/extractor/flipagram.py
@@ -81,7 +81,7 @@ class FlipagramIE(InfoExtractor):
            'filesize': int_or_none(cover.get('size')),
        } for cover in flipagram.get('covers', []) if cover.get('url')]

-        # Note that this only retrieves comments that are initally loaded.
+        # Note that this only retrieves comments that are initially loaded.
        # For videos with large amounts of comments, most won't be retrieved.
        comments = []
        for comment in video_data.get('comments', {}).get(video_id, {}).get('items', []):
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -79,6 +79,7 @@ from .dbtv import DBTVIE
 from .piksel import PikselIE
 from .videa import VideaIE
 from .twentymin import TwentyMinutenIE
+from .ustream import UstreamIE


 class GenericIE(InfoExtractor):
@@ -588,17 +589,6 @@ class GenericIE(InfoExtractor):
                'description': 'md5:8145d19d320ff3e52f28401f4c4283b9',
            }
        },
-        # Embedded Ustream video
-        {
-            'url': 'http://www.american.edu/spa/pti/nsa-privacy-janus-2014.cfm',
-            'md5': '27b99cdb639c9b12a79bca876a073417',
-            'info_dict': {
-                'id': '45734260',
-                'ext': 'flv',
-                'uploader': 'AU SPA:  The NSA and Privacy',
-                'title': 'NSA and Privacy Forum Debate featuring General Hayden and Barton Gellman'
-            }
-        },
        # nowvideo embed hidden behind percent encoding
        {
            'url': 'http://www.waoanime.tv/the-super-dimension-fortress-macross-episode-1/',
@@ -2112,10 +2102,9 @@ class GenericIE(InfoExtractor):
            return self.url_result(mobj.group('url'), 'TED')

        # Look for embedded Ustream videos
-        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>http://www\.ustream\.tv/embed/.+?)\1', webpage)
-        if mobj is not None:
-            return self.url_result(mobj.group('url'), 'Ustream')
+        ustream_url = UstreamIE._extract_url(webpage)
+        if ustream_url:
+            return self.url_result(ustream_url, UstreamIE.ie_key())

        # Look for embedded arte.tv player
        mobj = re.search(
--- a/youtube_dl/extractor/mtv.py
+++ b/youtube_dl/extractor/mtv.py
@@ -211,7 +211,7 @@ class MTVServicesInfoExtractor(InfoExtractor):

    def _extract_triforce_mgid(self, webpage, data_zone=None, video_id=None):
        triforce_feed = self._parse_json(self._search_regex(
-            r'triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage,
+            r'triforceManifestFeed\s*=\s*({.+?})\s*;\s*\n', webpage,
            'triforce feed', default='{}'), video_id, fatal=False)

        data_zone = self._search_regex(
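Review note, not part of the patch: a quick comparison of the old and the relaxed triforce feed patterns on two made-up inputs. Both patterns are copied from the hunk above; `find_feed` and the sample strings are hypothetical. Unlike the extractor, which falls back to `'{}'`, this sketch returns `None` when nothing matches.

```python
import json
import re

# Patterns copied from the removed and added lines of the diff above.
OLD_RE = r'triforceManifestFeed\s*=\s*(\{.+?\});\n'
NEW_RE = r'triforceManifestFeed\s*=\s*({.+?})\s*;\s*\n'

# Hypothetical page fragments: without and with a space before the semicolon.
TIGHT = 'var triforceManifestFeed = {"manifest": {"zones": {}}};\n'
SPACED = 'var triforceManifestFeed = {"manifest": {"zones": {}}} ;\n'


def find_feed(pattern, webpage):
    """Return the parsed feed object, or None if the pattern does not match."""
    m = re.search(pattern, webpage)
    return json.loads(m.group(1)) if m else None


print(find_feed(OLD_RE, SPACED))  # old pattern misses the spaced variant
print(find_feed(NEW_RE, SPACED))  # relaxed pattern still finds the feed
```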
--- a/youtube_dl/extractor/nextmedia.py
+++ b/youtube_dl/extractor/nextmedia.py
@@ -2,7 +2,15 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..utils import parse_iso8601
+from ..compat import compat_urlparse
+from ..utils import (
+    clean_html,
+    get_element_by_class,
+    int_or_none,
+    parse_iso8601,
+    remove_start,
+    unified_timestamp,
+)


 class NextMediaIE(InfoExtractor):
@@ -30,6 +38,12 @@ class NextMediaIE(InfoExtractor):
        return self._extract_from_nextmedia_page(news_id, url, page)

    def _extract_from_nextmedia_page(self, news_id, url, page):
+        redirection_url = self._search_regex(
+            r'window\.location\.href\s*=\s*([\'"])(?P<url>(?!\1).+)\1',
+            page, 'redirection URL', default=None, group='url')
+        if redirection_url:
+            return self.url_result(compat_urlparse.urljoin(url, redirection_url))
+
        title = self._fetch_title(page)
        video_url = self._search_regex(self._URL_PATTERN, page, 'video url')

@@ -93,7 +107,7 @@ class NextMediaActionNewsIE(NextMediaIE):

 class AppleDailyIE(NextMediaIE):
    IE_DESC = '臺灣蘋果日報'
-    _VALID_URL = r'https?://(www|ent)\.appledaily\.com\.tw/(?:animation|appledaily|enews|realtimenews|actionnews)/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
+    _VALID_URL = r'https?://(www|ent)\.appledaily\.com\.tw/[^/]+/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
    _TESTS = [{
        'url': 'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694',
        'md5': 'a843ab23d150977cc55ef94f1e2c1e4d',
@@ -157,6 +171,10 @@ class AppleDailyIE(NextMediaIE):
    }, {
        'url': 'http://www.appledaily.com.tw/actionnews/appledaily/7/20161003/960588/',
        'only_matching': True,
+    }, {
+        # Redirected from http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694
+        'url': 'http://ent.appledaily.com.tw/section/article/headline/20150128/36354694',
+        'only_matching': True,
    }]

    _URL_PATTERN = r'\{url: \'(.+)\'\}'
@@ -173,3 +191,48 @@ class AppleDailyIE(NextMediaIE):

    def _fetch_description(self, page):
        return self._html_search_meta('description', page, 'news description')
+
+
+class NextTVIE(InfoExtractor):
+    IE_DESC = '壹電視'
+    _VALID_URL = r'https?://(?:www\.)?nexttv\.com\.tw/(?:[^/]+/)+(?P<id>\d+)'
+
+    _TEST = {
+        'url': 'http://www.nexttv.com.tw/news/realtime/politics/11779671',
+        'info_dict': {
+            'id': '11779671',
+            'ext': 'mp4',
+            'title': '「超收稅」近4千億！　藍議員籲發消費券',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'timestamp': 1484825400,
+            'upload_date': '20170119',
+            'view_count': int,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        title = self._html_search_regex(
+            r'<h1[^>]*>([^<]+)</h1>', webpage, 'title')
+
+        data = self._hidden_inputs(webpage)
+
+        video_url = data['ntt-vod-src-detailview']
+
+        date_str = get_element_by_class('date', webpage)
+        timestamp = unified_timestamp(date_str + '+0800') if date_str else None
+
+        view_count = int_or_none(remove_start(
+            clean_html(get_element_by_class('click', webpage)), '點閱：'))
+
+        return {
+            'id': video_id,
+            'title': title,
+            'url': video_url,
+            'thumbnail': data.get('ntt-vod-img-src'),
+            'timestamp': timestamp,
+            'view_count': view_count,
+        }
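Review note, not part of the patch: a sketch of the new redirect handling in `_extract_from_nextmedia_page`. The regular expression is copied from the hunk above; the page fragment is made up, `find_redirect` is a hypothetical name, and Python 3's `urllib.parse.urljoin` stands in for `compat_urlparse.urljoin`.

```python
import re
from urllib.parse import urljoin

# Regular expression copied from the diff above.
REDIRECT_RE = r'window\.location\.href\s*=\s*([\'"])(?P<url>(?!\1).+)\1'


def find_redirect(page_url, page):
    """Return the absolute redirect target, or None if the page has none."""
    m = re.search(REDIRECT_RE, page)
    if not m:
        return None
    # Relative targets are resolved against the URL of the current page.
    return urljoin(page_url, m.group('url'))


# Hypothetical page body; the real markup may differ.
page = ("<script>window.location.href = "
        "'/section/article/headline/20150128/36354694';</script>")

print(find_redirect(
    'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694',
    page))
```

When a target is found, the extractor hands it to `self.url_result(...)` so that the redirected URL goes through normal extractor selection again.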
--- a/youtube_dl/extractor/pluralsight.py
+++ b/youtube_dl/extractor/pluralsight.py
@@ -157,13 +157,10 @@ class PluralsightIE(PluralsightBaseIE):

        display_id = '%s-%s' % (name, clip_id)

-        parsed_url = compat_urlparse.urlparse(url)
-
-        payload_url = compat_urlparse.urlunparse(parsed_url._replace(
-            netloc='app.pluralsight.com', path='player/api/v1/payload'))
-
        course = self._download_json(
-            payload_url, display_id, headers={'Referer': url})['payload']['course']
+            'https://app.pluralsight.com/player/user/api/v1/player/payload',
+            display_id, data=urlencode_postdata({'courseId': course_name}),
+            headers={'Referer': url})

        collection = course['modules']

--- a/youtube_dl/extractor/pornflip.py
+++ b/youtube_dl/extractor/pornflip.py
@@ -0,0 +1,92 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import (
+    compat_parse_qs,
+    compat_str,
+)
+from ..utils import (
+    int_or_none,
+    try_get,
+    unified_timestamp,
+)
+
+
+class PornFlipIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?pornflip\.com/(?:v|embed)/(?P<id>[0-9A-Za-z]{11})'
+    _TESTS = [{
+        'url': 'https://www.pornflip.com/v/wz7DfNhMmep',
+        'md5': '98c46639849145ae1fd77af532a9278c',
+        'info_dict': {
+            'id': 'wz7DfNhMmep',
+            'ext': 'mp4',
+            'title': '2 Amateurs swallow make his dream cumshots true',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 112,
+            'timestamp': 1481655502,
+            'upload_date': '20161213',
+            'uploader_id': '106786',
+            'uploader': 'figifoto',
+            'view_count': int,
+            'age_limit': 18,
+        }
+    }, {
+        'url': 'https://www.pornflip.com/embed/wz7DfNhMmep',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(
+            'https://www.pornflip.com/v/%s' % video_id, video_id)
+
+        flashvars = compat_parse_qs(self._search_regex(
+            r'<embed[^>]+flashvars=(["\'])(?P<flashvars>(?:(?!\1).)+)\1',
+            webpage, 'flashvars', group='flashvars'))
+
+        title = flashvars['video_vars[title]'][0]
+
+        def flashvar(kind):
+            return try_get(
+                flashvars, lambda x: x['video_vars[%s]' % kind][0], compat_str)
+
+        formats = []
+        for key, value in flashvars.items():
+            if not (value and isinstance(value, list)):
+                continue
+            format_url = value[0]
+            if key == 'video_vars[hds_manifest]':
+                formats.extend(self._extract_mpd_formats(
+                    format_url, video_id, mpd_id='dash', fatal=False))
+                continue
+            height = self._search_regex(
+                r'video_vars\[video_urls\]\[(\d+)', key, 'height', default=None)
+            if not height:
+                continue
+            formats.append({
+                'url': format_url,
+                'format_id': 'http-%s' % height,
+                'height': int_or_none(height),
+            })
+        self._sort_formats(formats)
+
+        uploader = self._html_search_regex(
+            (r'<span[^>]+class="name"[^>]*>\s*<a[^>]+>\s*<strong>(?P<uploader>[^<]+)',
+             r'<meta[^>]+content=(["\'])[^>]*\buploaded by (?P<uploader>.+?)\1'),
+            webpage, 'uploader', fatal=False, group='uploader')
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'title': title,
+            'thumbnail': flashvar('big_thumb'),
+            'duration': int_or_none(flashvar('duration')),
+            'timestamp': unified_timestamp(self._html_search_meta(
+                'uploadDate', webpage, 'timestamp')),
+            'uploader_id': flashvar('author_id'),
+            'uploader': uploader,
+            'view_count': int_or_none(flashvar('views')),
+            'age_limit': 18,
+        }
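Review note, not part of the patch: a sketch of how the `flashvars` query string yields HTTP formats. The key pattern `video_vars[video_urls][<height>...]` is taken from the file above; the sample string, its URLs and the helper name `http_formats` are hypothetical, and `urllib.parse.parse_qs` stands in for `compat_parse_qs`.

```python
import re
from urllib.parse import parse_qs

# Hypothetical flashvars string; the real attribute may carry more keys.
SAMPLE_FLASHVARS = (
    'video_vars[title]=Sample+clip'
    '&video_vars[duration]=112'
    '&video_vars[video_urls][360p]=https%3A%2F%2Fexample.invalid%2F360.mp4'
    '&video_vars[video_urls][720p]=https%3A%2F%2Fexample.invalid%2F720.mp4'
)


def http_formats(flashvars_str):
    """Collect one format dict per video_urls key, ordered by height."""
    flashvars = parse_qs(flashvars_str)
    formats = []
    for key, value in flashvars.items():
        m = re.search(r'video_vars\[video_urls\]\[(\d+)', key)
        if not m:
            continue  # not a video URL entry (title, duration, ...)
        height = m.group(1)
        formats.append({
            'url': value[0],
            'format_id': 'http-%s' % height,
            'height': int(height),
        })
    return sorted(formats, key=lambda f: f['height'])


for f in http_formats(SAMPLE_FLASHVARS):
    print(f['format_id'], f['url'])
```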
--- a/youtube_dl/extractor/twentyfourvideo.py
+++ b/youtube_dl/extractor/twentyfourvideo.py
@@ -12,7 +12,7 @@ from ..utils import (

 class TwentyFourVideoIE(InfoExtractor):
    IE_NAME = '24video'
-    _VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx|sex)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'

    _TESTS = [{
        'url': 'http://www.24video.net/video/view/1044982',
@@ -43,7 +43,7 @@ class TwentyFourVideoIE(InfoExtractor):
        video_id = self._match_id(url)

        webpage = self._download_webpage(
-            'http://www.24video.net/video/view/%s' % video_id, video_id)
+            'http://www.24video.sex/video/view/%s' % video_id, video_id)

        title = self._og_search_title(webpage)
        description = self._html_search_regex(
@@ -69,11 +69,11 @@ class TwentyFourVideoIE(InfoExtractor):

        # Sets some cookies
        self._download_xml(
-            r'http://www.24video.net/video/xml/%s?mode=init' % video_id,
+            r'http://www.24video.sex/video/xml/%s?mode=init' % video_id,
            video_id, 'Downloading init XML')

        video_xml = self._download_xml(
-            'http://www.24video.net/video/xml/%s?mode=play' % video_id,
+            'http://www.24video.sex/video/xml/%s?mode=play' % video_id,
            video_id, 'Downloading video XML')

        video = xpath_element(video_xml, './/video', 'video', fatal=True)
--- a/youtube_dl/extractor/uol.py
+++ b/youtube_dl/extractor/uol.py
@@ -84,12 +84,27 @@ class UOLIE(InfoExtractor):

    def _real_extract(self, url):
        video_id = self._match_id(url)
-        if not video_id.isdigit():
-            embed_page = self._download_webpage('https://jsuol.com.br/c/tv/uol/embed/?params=[embed,%s]' % video_id, video_id)
-            video_id = self._search_regex(r'mediaId=(\d+)', embed_page, 'media id')
+        media_id = None
+
+        if video_id.isdigit():
+            media_id = video_id
+
+        if not media_id:
+            embed_page = self._download_webpage(
+                'https://jsuol.com.br/c/tv/uol/embed/?params=[embed,%s]' % video_id,
+                video_id, 'Downloading embed page', fatal=False)
+            if embed_page:
+                media_id = self._search_regex(
+                    (r'uol\.com\.br/(\d+)', r'mediaId=(\d+)'),
+                    embed_page, 'media id', default=None)
+
+        if not media_id:
+            webpage = self._download_webpage(url, video_id)
+            media_id = self._search_regex(r'mediaId=(\d+)', webpage, 'media id')
+
        video_data = self._download_json(
-            'http://mais.uol.com.br/apiuol/v3/player/getMedia/%s.json' % video_id,
-            video_id)['item']
+            'http://mais.uol.com.br/apiuol/v3/player/getMedia/%s.json' % media_id,
+            media_id)['item']
        title = video_data['title']

        query = {
@@ -118,7 +133,7 @@ class UOLIE(InfoExtractor):
            tags.append(tag_description)

        return {
-            'id': video_id,
+            'id': media_id,
            'title': title,
            'description': clean_html(video_data.get('desMedia')),
            'thumbnail': video_data.get('thumbnail'),
--- a/youtube_dl/extractor/ustream.py
+++ b/youtube_dl/extractor/ustream.py
@@ -69,6 +69,13 @@ class UstreamIE(InfoExtractor):
        },
    }]

+    @staticmethod
+    def _extract_url(webpage):
+        mobj = re.search(
+            r'<iframe[^>]+?src=(["\'])(?P<url>http://www\.ustream\.tv/embed/.+?)\1', webpage)
+        if mobj is not None:
+            return mobj.group('url')
+
    def _get_stream_info(self, url, video_id, app_id_ver, extra_note=None):
        def num_to_hex(n):
            return hex(n)[2:]
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@@ -338,7 +338,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
            'expected_warnings': ['Unable to download JSON metadata'],
        },
        {
-            # redirects to ondemand extractor and should be passed throught it
+            # redirects to ondemand extractor and should be passed through it
            # for successful extraction
            'url': 'https://vimeo.com/73445910',
            'info_dict': {
@@ -730,12 +730,12 @@ class VimeoChannelIE(VimeoBaseInfoExtractor):
            # Try extracting href first since not all videos are available via
            # short https://vimeo.com/id URL (e.g. https://vimeo.com/channels/tributes/6213729)
            clips = re.findall(
-                r'id="clip_(\d+)"[^>]*>\s*<a[^>]+href="(/(?:[^/]+/)*\1)', webpage)
+                r'id="clip_(\d+)"[^>]*>\s*<a[^>]+href="(/(?:[^/]+/)*\1)(?:[^>]+\btitle="([^"]+)")?', webpage)
            if clips:
-                for video_id, video_url in clips:
+                for video_id, video_url, video_title in clips:
                    yield self.url_result(
                        compat_urlparse.urljoin(base_url, video_url),
-                        VimeoIE.ie_key(), video_id=video_id)
+                        VimeoIE.ie_key(), video_id=video_id, video_title=video_title)
            # More relaxed fallback
            else:
                for video_id in re.findall(r'id=["\']clip_(\d+)', webpage):
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -864,6 +864,30 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                'skip_download': True,
            },
        },
+        {
+            # YouTube Red video with episode data
+            'url': 'https://www.youtube.com/watch?v=iqKdEhx-dD4',
+            'info_dict': {
+                'id': 'iqKdEhx-dD4',
+                'ext': 'mp4',
+                'title': 'Isolation - Mind Field (Ep 1)',
+                'description': 'md5:3a72f23c086a1496c9e2c54a25fa0822',
+                'upload_date': '20170118',
+                'uploader': 'Vsauce',
+                'uploader_id': 'Vsauce',
+                'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/Vsauce',
+                'license': 'Standard YouTube License',
+                'series': 'Mind Field',
+                'season_number': 1,
+                'episode_number': 1,
+            },
+            'params': {
+                'skip_download': True,
+            },
+            'expected_warnings': [
+                'Skipping DASH manifest',
+            ],
+        },
        {
            # itag 212
            'url': '1t24XAntNCY',
@@ -1454,6 +1478,16 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
        else:
            video_alt_title = video_creator = None

+        m_episode = re.search(
+            r'<div[^>]+id="watch7-headline"[^>]*>\s*<span[^>]*>.*?>(?P<series>[^<]+)</a></b>\s*S(?P<season>\d+)\s*•\s*E(?P<episode>\d+)</span>',
+            video_webpage)
+        if m_episode:
+            series = m_episode.group('series')
+            season_number = int(m_episode.group('season'))
+            episode_number = int(m_episode.group('episode'))
+        else:
+            series = season_number = episode_number = None
+
        m_cat_container = self._search_regex(
            r'(?s)<h4[^>]*>\s*Category\s*</h4>\s*<ul[^>]*>(.*?)</ul>',
            video_webpage, 'categories', default=None)
@@ -1743,6 +1777,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            'is_live': is_live,
            'start_time': start_time,
            'end_time': end_time,
+            'series': series,
+            'season_number': season_number,
+            'episode_number': episode_number,
        }


@@ -1819,6 +1856,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
            'title': 'YDL_Empty_List',
        },
        'playlist_count': 0,
+        'skip': 'This playlist is private',
    }, {
        'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.',
        'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
@@ -1850,6 +1888,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
            'id': 'PLtPgu7CB4gbY9oDN3drwC3cMbJggS7dKl',
        },
        'playlist_count': 2,
+        'skip': 'This playlist is private',
    }, {
        'note': 'embedded',
        'url': 'https://www.youtube.com/embed/videoseries?list=PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
@@ -1961,14 +2000,18 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
        url = self._TEMPLATE_URL % playlist_id
        page = self._download_webpage(url, playlist_id)

-        for match in re.findall(r'<div class="yt-alert-message">([^<]+)</div>', page):
+        # the yt-alert-message now has tabindex attribute (see https://github.com/rg3/youtube-dl/issues/11604)
+        for match in re.findall(r'<div class="yt-alert-message"[^>]*>([^<]+)</div>', page):
            match = match.strip()
            # Check if the playlist exists or is private
-            if re.match(r'[^<]*(The|This) playlist (does not exist|is private)[^<]*', match):
-                raise ExtractorError(
-                    'The playlist doesn\'t exist or is private, use --username or '
-                    '--netrc to access it.',
-                    expected=True)
+            mobj = re.match(r'[^<]*(?:The|This) playlist (?P<reason>does not exist|is private)[^<]*', match)
+            if mobj:
+                reason = mobj.group('reason')
+                message = 'This playlist %s' % reason
+                if 'private' in reason:
+                    message += ', use --username or --netrc to access it'
+                message += '.'
+                raise ExtractorError(message, expected=True)
            elif re.match(r'[^<]*Invalid parameters[^<]*', match):
                raise ExtractorError(
                    'Invalid parameters. Maybe URL is incorrect.',
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -751,7 +751,7 @@ def parseOpts(overrideArguments=None):
        help='Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
    postproc.add_option(
        '--audio-format', metavar='FORMAT', dest='audioformat', default='best',
-        help='Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default')
+        help='Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default; No effect without -x')
    postproc.add_option(
        '--audio-quality', metavar='QUALITY',
        dest='audioquality', default='5',
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -143,6 +143,7 @@ DATE_FORMATS = (
    '%Y/%m/%d',
    '%Y/%m/%d %H:%M',
    '%Y/%m/%d %H:%M:%S',
+    '%Y-%m-%d %H:%M',
    '%Y-%m-%d %H:%M:%S',
    '%Y-%m-%d %H:%M:%S.%f',
    '%d.%m.%Y %H:%M',
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2017.01.18'
+__version__ = '2017.01.24'
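
The private/nonexistent playlist check from the youtube.py hunk above can be tried in isolation. The following is a minimal sketch, not the extractor's own code: `playlist_alert_message` is a hypothetical helper name, it returns the message instead of raising `ExtractorError`, and the sample HTML is invented for illustration.

```python
import re


def playlist_alert_message(page):
    """Return the user-facing error text for a playlist alert, or None.

    Hypothetical standalone helper: the extractor itself raises
    ExtractorError(message, expected=True) at this point.
    """
    # [^>]* tolerates extra attributes such as tabindex on the alert div.
    for match in re.findall(
            r'<div class="yt-alert-message"[^>]*>([^<]+)</div>', page):
        match = match.strip()
        mobj = re.match(
            r'[^<]*(?:The|This) playlist '
            r'(?P<reason>does not exist|is private)[^<]*',
            match)
        if mobj:
            reason = mobj.group('reason')
            message = 'This playlist %s' % reason
            # Only a private playlist can be unlocked by logging in.
            if 'private' in reason:
                message += ', use --username or --netrc to access it'
            return message + '.'
    return None


if __name__ == '__main__':
    # Sample markup is an assumption, modelled on the tabindex case.
    sample = ('<div class="yt-alert-message" tabindex="0">\n'
              '  This playlist is private.\n</div>')
    print(playlist_alert_message(sample))
```

With a `tabindex` attribute on the div, the earlier pattern `<div class="yt-alert-message">` finds nothing, which appears to be the failure reported in #11604.
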
Author	SHA1	Message	Date
Sergey M․	c3a65c3de0	release 2017.01.24	2017-01-24 02:58:37 +07:00
Sergey M․	ee4c091ce5	[ChangeLog] Actualize	2017-01-24 02:56:19 +07:00
Sergey M․	b494d6856c	[pluralsight] Fix extraction (closes #11820)	2017-01-24 02:51:45 +07:00
Yen Chi Hsuan	bc35ed3fb6	[nextmedia] Add support for NextTV (壹電視)	2017-01-23 23:38:31 +08:00
Yen Chi Hsuan	0c1c6f4b9f	[utils] Add another date format seen in NextTV	2017-01-23 23:37:32 +08:00
Sergey M․	6d119c2a6b	[24video] Fix extraction (closes #11811)	2017-01-23 03:50:39 +07:00
Sergey M․	4201ba13e6	[youtube:playlist] Fix nonexistent/private playlist detection and skip private tests	2017-01-23 02:49:56 +07:00
Grzegorz P	8bc0800d7c	[youtube:playlist] Fix nonexistent/private playlist detection (closes #11604)	2017-01-23 02:35:38 +07:00
Alex Seiler	a089545e03	[azmedien:show] Improve _VALID_URL	2017-01-23 02:30:29 +07:00
Gaetan Gilbert	30dda24de3	[chirbit] Extract uploader	2017-01-23 02:27:38 +07:00
Sergey M․	9d5b29c881	release 2017.01.22	2017-01-22 18:59:04 +07:00
Sergey M․	6c031a35f3	[ChangeLog] Actualize	2017-01-22 18:57:15 +07:00
Sergey M․	271808b6b2	[pornflip] Improve and extract dash formats (closes #11795)	2017-01-22 03:43:27 +07:00
einstein95	8d1fbe0cb2	[pornflip] Add extractor (closes #11556)	2017-01-22 03:41:59 +07:00
Sergey M․	a243abb80d	[chaturbate] Improve (closes #11797)	2017-01-22 03:02:48 +07:00
einstein95	42697bab3c	[chaturbate] Fix extraction	2017-01-22 02:58:40 +07:00
Sergey M․	94629e537f	[azmedien] Improve (closes #11784)	2017-01-22 02:17:39 +07:00
Alex Seiler	e84495cd8d	[azmedien] Add extractor (closes #11785)	2017-01-22 02:17:39 +07:00
Yen Chi Hsuan	7c20b7484c	[nextmedia] Support redirected URLs	2017-01-22 02:06:34 +08:00
ha shao	04a3d4d234	[vimeo:channel] Extract videos' titles for playlist entries	2017-01-21 23:37:44 +07:00
Sergey M․	12afdc2ad6	[youtube] Extract episode metadata (closes #9695, closes #11774)	2017-01-21 18:10:32 +07:00
Iulian Onofrei	f4ec8dce48	Update README.md (#11787) Add audio format argument dependency warning	2017-01-21 00:25:04 +08:00
Yen Chi Hsuan	f3c21cb7a7	[cspan] Fix _TESTS	2017-01-20 22:27:13 +08:00
Yen Chi Hsuan	972efe60c3	[generic] Remove a dead test The web page does not contain a video anymore Ref: #2694, #2696	2017-01-20 22:27:13 +08:00
Yen Chi Hsuan	4447fb2332	[cspan] Support Ustream embedded videos Closes #11547	2017-01-20 22:27:13 +08:00
Yen Chi Hsuan	d77ac73790	[ustream] Add UstreamIE._extract_url() Ref: #11547	2017-01-20 22:27:13 +08:00
Sergey M․	1fe84be0f3	[1tv] Add support for hls (closes #11786)	2017-01-20 00:47:04 +07:00
Yen Chi Hsuan	1076858f76	Merge pull request #11778 from h4ck3rm1k3/master Fix typos	2017-01-19 19:32:29 +08:00
james mike dupont	cccd70a275	untie	2017-01-19 04:18:13 -05:00
Sergey M․	eb3f008c9e	[uol] Fix extraction (closes #11770)	2017-01-19 04:49:31 +07:00
Sergey M․	f1e70fc2ff	[mtv] Relax triforce feed regex (closes #11766)	2017-01-18 23:34:11 +07:00
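
Commit 0c1c6f4b9f adds `'%Y-%m-%d %H:%M'` to `DATE_FORMATS` in utils.py. The sketch below illustrates why a seconds-less timestamp needs its own entry. It is a simplified stand-in and not the library's own parsing function: `to_upload_date` and the shortened `FORMATS` tuple are hypothetical names, and the sample timestamps are made up.

```python
from datetime import datetime

# Shortened, illustrative list; the real DATE_FORMATS tuple is much longer.
FORMATS = (
    '%Y-%m-%d %H:%M',
    '%Y-%m-%d %H:%M:%S',
)


def to_upload_date(date_str, formats=FORMATS):
    """Try each format in turn and return YYYYMMDD, or None if none match."""
    for fmt in formats:
        try:
            return datetime.strptime(date_str, fmt).strftime('%Y%m%d')
        except ValueError:
            # strptime needs the whole string to match the format exactly,
            # so a timestamp without seconds fails against '%H:%M:%S'.
            continue
    return None


if __name__ == '__main__':
    print(to_upload_date('2017-01-23 23:37'))
    print(to_upload_date('2017-01-23 23:37', ('%Y-%m-%d %H:%M:%S',)))
```

Trying the formats in order keeps the lookup simple: each candidate either parses the full string or raises `ValueError`, so adding a new layout is a one-line change to the tuple.
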