Compare commits
20 commits: 2016.12.31...2017.01.05

7232bb299b
2b12e34076
fb47cb5b23
b6de53ea8a
96d315c2be
1911d77d28
027e231295
7a9e066972
2021b650dd
b890caaf21
3783a5ccba
327caf661a
ce7ccb1caa
295eac6165
d546d4c8e0
eec45445a8
7fc06b6a15
966815e139
e5e19379be
1f766b6e7b
.github/ISSUE_TEMPLATE.md: 8 changes (vendored)

@@ -6,8 +6,8 @@
 ---
 
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.12.22*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.12.22**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.05*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.05**
 
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.12.22
+[debug] youtube-dl version 2017.01.05
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
 - Single video: https://youtu.be/BaW_jenozKc
 - Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
+
+Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 
 ---
 
 ### Description of your *issue*, suggested solution and other information
.github/ISSUE_TEMPLATE_tmpl.md: 2 changes (vendored)

@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
 - Single video: https://youtu.be/BaW_jenozKc
 - Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
+
+Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 
 ---
 
 ### Description of your *issue*, suggested solution and other information
CONTRIBUTING.md

@@ -58,7 +58,7 @@ We are then presented with a very complicated request when the original problem
 
 Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
 
-In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
+In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
 
 ### Is anyone going to need the feature?
 
@@ -94,7 +94,7 @@ If you want to create a build of youtube-dl yourself, you'll need
 
 If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
 
-After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
+After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
 
 1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
 2. Check out the source code with:
@@ -199,7 +199,7 @@ Assume at this point `meta`'s layout is:
 }
 ```
 
-Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
+Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
 
 ```python
 description = meta.get('summary')  # correct
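The optional-field rule from the CONTRIBUTING excerpt above can be illustrated with a short, self-contained sketch (the `meta` dict here is a made-up stand-in for a parsed page):

```python
meta = {'title': 'Example video'}  # 'summary' is absent, as it may be on real pages

# Mandatory field: indexing is deliberate, so a missing key raises KeyError
# and loudly signals broken extraction
title = meta['title']

# Optional field: dict.get() returns None instead of raising when absent
description = meta.get('summary')

print(title, description)  # Example video None
```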
ChangeLog: 17 changes

@@ -1,3 +1,20 @@
+version 2017.01.05
+
+Extractors
++ [zdf] Fix extraction (#11055, #11063)
+* [pornhub:playlist] Improve extraction (#11594)
++ [cctv] Add support for ncpa-classic.com (#11591)
++ [tunein] Add support for embeds (#11579)
+
+
+version 2017.01.02
+
+Extractors
+* [cctv] Improve extraction (#879, #6753, #8541)
++ [nrktv:episodes] Add support for episodes (#11571)
++ [arkena] Add support for video.arkena.com (#11568)
+
+
 version 2016.12.31
 
 Core
README.md

@@ -29,7 +29,7 @@ Windows users can [download an .exe file](https://yt-dl.org/latest/youtube-dl.exe
 
 You can also use pip:
 
-    sudo pip install --upgrade youtube-dl
+    sudo -H pip install --upgrade youtube-dl
 
 This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
 
@@ -1148,7 +1148,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
     ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
 ```
 
-Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
+Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
 
 Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
docs/supportedsites.md

@@ -132,7 +132,7 @@
 - **cbsnews:livevideo**: CBS News Live Videos
 - **CBSSports**
 - **CCMA**
-- **CCTV**
+- **CCTV**: 央视网
 - **CDA**
 - **CeskaTelevize**
 - **channel9**: Channel 9
@@ -517,6 +517,7 @@
 - **NRKSkole**: NRK Skole
 - **NRKTV**: NRK TV and NRK Radio
 - **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
+- **NRKTVEpisodes**
 - **ntv.ru**
 - **Nuvid**
 - **NYTimes**
youtube_dl/extractor/arkena.py

@@ -4,8 +4,10 @@ from __future__ import unicode_literals
 import re
 
 from .common import InfoExtractor
+from ..compat import compat_urlparse
 from ..utils import (
     determine_ext,
+    ExtractorError,
     float_or_none,
     int_or_none,
     mimetype2ext,
@@ -15,7 +17,13 @@ from ..utils import (
 
 
 class ArkenaIE(InfoExtractor):
-    _VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:
+                            video\.arkena\.com/play2/embed/player\?|
+                            play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
+                        )
+                    '''
     _TESTS = [{
         'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
         'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
@@ -37,6 +45,9 @@ class ArkenaIE(InfoExtractor):
     }, {
         'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
         'only_matching': True,
+    }, {
+        'url': 'http://video.arkena.com/play2/embed/player?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled',
+        'only_matching': True,
     }]
 
     @staticmethod
@@ -53,6 +64,14 @@ class ArkenaIE(InfoExtractor):
         video_id = mobj.group('id')
         account_id = mobj.group('account_id')
 
+        # Handle http://video.arkena.com/play2/embed/player URL
+        if not video_id:
+            qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+            video_id = qs.get('mediaId', [None])[0]
+            account_id = qs.get('accountId', [None])[0]
+            if not video_id or not account_id:
+                raise ExtractorError('Invalid URL', expected=True)
+
         playlist = self._download_json(
             'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
             % (video_id, account_id),
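The query-string fallback added to the Arkena extractor can be sketched on its own; this uses Python 3's `urllib.parse` in place of youtube-dl's `compat_urlparse` shim, with the embed URL taken from the new test case:

```python
from urllib.parse import parse_qs, urlparse

url = ('http://video.arkena.com/play2/embed/player'
       '?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled')

# parse_qs maps each parameter name to a LIST of values, hence the
# [None] default and the [0] index to get a scalar back
qs = parse_qs(urlparse(url).query)
video_id = qs.get('mediaId', [None])[0]
account_id = qs.get('accountId', [None])[0]

print(video_id, account_id)  # 35763b3b-00090078-bf604299 472718
```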
youtube_dl/extractor/cctv.py

@@ -4,50 +4,188 @@ from __future__ import unicode_literals
 import re
 
 from .common import InfoExtractor
-from ..utils import float_or_none
+from ..compat import compat_str
+from ..utils import (
+    float_or_none,
+    try_get,
+    unified_timestamp,
+)
 
 
 class CCTVIE(InfoExtractor):
-    _VALID_URL = r'''(?x)https?://(?:.+?\.)?
-        (?:
-            cctv\.(?:com|cn)|
-            cntv\.cn
-        )/
-        (?:
-            video/[^/]+/(?P<id>[0-9a-f]{32})|
-            \d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
-        )'''
+    IE_DESC = '央视网'
+    _VALID_URL = r'https?://(?:(?:[^/]+)\.(?:cntv|cctv)\.(?:com|cn)|(?:www\.)?ncpa-classic\.com)/(?:[^/]+/)*?(?P<id>[^/?#&]+?)(?:/index)?(?:\.s?html|[?#&]|$)'
     _TESTS = [{
-        'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
-        'md5': '819c7b49fc3927d529fb4cd555621823',
+        # fo.addVariable("videoCenterId","id")
+        'url': 'http://sports.cntv.cn/2016/02/12/ARTIaBRxv4rTT1yWf1frW2wi160212.shtml',
+        'md5': 'd61ec00a493e09da810bf406a078f691',
         'info_dict': {
-            'id': '454368eb19ad44a1925bf1eb96140a61',
+            'id': '5ecdbeab623f4973b40ff25f18b174e8',
             'ext': 'mp4',
-            'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
-        }
+            'title': '[NBA]二少联手砍下46分 雷霆主场击败鹈鹕(快讯)',
+            'description': 'md5:7e14a5328dc5eb3d1cd6afbbe0574e95',
+            'duration': 98,
+            'uploader': 'songjunjie',
+            'timestamp': 1455279956,
+            'upload_date': '20160212',
+        },
+    }, {
+        # var guid = "id"
+        'url': 'http://tv.cctv.com/2016/02/05/VIDEUS7apq3lKrHG9Dncm03B160205.shtml',
+        'info_dict': {
+            'id': 'efc5d49e5b3b4ab2b34f3a502b73d3ae',
+            'ext': 'mp4',
+            'title': '[赛车]“车王”舒马赫恢复情况成谜(快讯)',
+            'description': '2月4日,蒙特泽莫罗透露了关于“车王”舒马赫恢复情况,但情况是否属实遭到了质疑。',
+            'duration': 37,
+            'uploader': 'shujun',
+            'timestamp': 1454677291,
+            'upload_date': '20160205',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # changePlayer('id')
+        'url': 'http://english.cntv.cn/special/four_comprehensives/index.shtml',
+        'info_dict': {
+            'id': '4bb9bb4db7a6471ba85fdeda5af0381e',
+            'ext': 'mp4',
+            'title': 'NHnews008 ANNUAL POLITICAL SEASON',
+            'description': 'Four Comprehensives',
+            'duration': 60,
+            'uploader': 'zhangyunlei',
+            'timestamp': 1425385521,
+            'upload_date': '20150303',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # loadvideo('id')
+        'url': 'http://cctv.cntv.cn/lm/tvseries_russian/yilugesanghua/index.shtml',
+        'info_dict': {
+            'id': 'b15f009ff45c43968b9af583fc2e04b2',
+            'ext': 'mp4',
+            'title': 'Путь,усыпанный космеями Серия 1',
+            'description': 'Путь, усыпанный космеями',
+            'duration': 2645,
+            'uploader': 'renxue',
+            'timestamp': 1477479241,
+            'upload_date': '20161026',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # var initMyAray = 'id'
+        'url': 'http://www.ncpa-classic.com/2013/05/22/VIDE1369219508996867.shtml',
+        'info_dict': {
+            'id': 'a194cfa7f18c426b823d876668325946',
+            'ext': 'mp4',
+            'title': '小泽征尔音乐塾 音乐梦想无国界',
+            'duration': 2173,
+            'timestamp': 1369248264,
+            'upload_date': '20130522',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # var ids = ["id"]
+        'url': 'http://www.ncpa-classic.com/clt/more/416/index.shtml',
+        'info_dict': {
+            'id': 'a8606119a4884588a79d81c02abecc16',
+            'ext': 'mp3',
+            'title': '来自维也纳的新年贺礼',
+            'description': 'md5:f13764ae8dd484e84dd4b39d5bcba2a7',
+            'duration': 1578,
+            'uploader': 'djy',
+            'timestamp': 1482942419,
+            'upload_date': '20161228',
+        },
+        'params': {
+            'skip_download': True,
+        },
+        'expected_warnings': ['Failed to download m3u8 information'],
+    }, {
+        'url': 'http://ent.cntv.cn/2016/01/18/ARTIjprSSJH8DryTVr5Bx8Wb160118.shtml',
+        'only_matching': True,
+    }, {
+        'url': 'http://tv.cntv.cn/video/C39296/e0210d949f113ddfb38d31f00a4e5c44',
+        'only_matching': True,
+    }, {
+        'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
+        'only_matching': True,
+    }, {
+        'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
+        'only_matching': True,
+    }, {
         'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
-        'only_matching': True
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):
-        video_id, display_id = re.match(self._VALID_URL, url).groups()
-        if not video_id:
-            webpage = self._download_webpage(url, display_id)
-            video_id = self._search_regex(
-                r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
-                webpage, 'video_id')
-        api_data = self._download_json(
-            'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
-        m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        video_id = self._search_regex(
+            [r'var\s+guid\s*=\s*["\']([\da-fA-F]+)',
+             r'videoCenterId["\']\s*,\s*["\']([\da-fA-F]+)',
+             r'changePlayer\s*\(\s*["\']([\da-fA-F]+)',
+             r'load[Vv]ideo\s*\(\s*["\']([\da-fA-F]+)',
+             r'var\s+initMyAray\s*=\s*["\']([\da-fA-F]+)',
+             r'var\s+ids\s*=\s*\[["\']([\da-fA-F]+)'],
+            webpage, 'video id')
+
+        data = self._download_json(
+            'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do', video_id,
+            query={
+                'pid': video_id,
+                'url': url,
+                'idl': 32,
+                'idlr': 32,
+                'modifyed': 'false',
+            })
+
+        title = data['title']
+
+        formats = []
+
+        video = data.get('video')
+        if isinstance(video, dict):
+            for quality, chapters_key in enumerate(('lowChapters', 'chapters')):
+                video_url = try_get(
+                    video, lambda x: x[chapters_key][0]['url'], compat_str)
+                if video_url:
+                    formats.append({
+                        'url': video_url,
+                        'format_id': 'http',
+                        'quality': quality,
+                        'preference': -1,
+                    })
+
+        hls_url = try_get(data, lambda x: x['hls_url'], compat_str)
+        if hls_url:
+            hls_url = re.sub(r'maxbr=\d+&?', '', hls_url)
+            formats.extend(self._extract_m3u8_formats(
+                hls_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                m3u8_id='hls', fatal=False))
+
+        self._sort_formats(formats)
+
+        uploader = data.get('editer_name')
+        description = self._html_search_meta(
+            'description', webpage, default=None)
+        timestamp = unified_timestamp(data.get('f_pgmtime'))
+        duration = float_or_none(try_get(video, lambda x: x['totalLength']))
 
         return {
             'id': video_id,
-            'title': api_data['title'],
-            'formats': self._extract_m3u8_formats(
-                m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
-            'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
+            'title': title,
+            'description': description,
+            'uploader': uploader,
+            'timestamp': timestamp,
+            'duration': duration,
+            'formats': formats,
         }
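The core of the CCTV change is trying several id-embedding patterns against the page, one per way the sites inline the 32-hex video id. A standalone sketch of that idea, using the same patterns and a made-up `webpage` snippet modelled on the `fo.addVariable` case:

```python
import re

# Patterns mirror the embedding styles named in the test comments above:
# var guid = "...", videoCenterId, changePlayer(...), loadvideo(...),
# var initMyAray = '...', var ids = ["..."]
patterns = [
    r'var\s+guid\s*=\s*["\']([\da-fA-F]+)',
    r'videoCenterId["\']\s*,\s*["\']([\da-fA-F]+)',
    r'changePlayer\s*\(\s*["\']([\da-fA-F]+)',
    r'load[Vv]ideo\s*\(\s*["\']([\da-fA-F]+)',
    r'var\s+initMyAray\s*=\s*["\']([\da-fA-F]+)',
    r'var\s+ids\s*=\s*\[["\']([\da-fA-F]+)',
]

webpage = 'fo.addVariable("videoCenterId","5ecdbeab623f4973b40ff25f18b174e8")'

# Take the first pattern that matches, like _search_regex with a list does
video_id = next(
    (m.group(1) for p in patterns for m in [re.search(p, webpage)] if m),
    None)
print(video_id)  # 5ecdbeab623f4973b40ff25f18b174e8
```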
youtube_dl/extractor/extractors.py

@@ -655,6 +655,7 @@ from .nrk import (
     NRKSkoleIE,
     NRKTVIE,
     NRKTVDirekteIE,
+    NRKTVEpisodesIE,
 )
 from .ntvde import NTVDeIE
 from .ntvru import NTVRuIE
youtube_dl/extractor/generic.py

@@ -73,6 +73,7 @@ from .kaltura import KalturaIE
 from .eagleplatform import EaglePlatformIE
 from .facebook import FacebookIE
 from .soundcloud import SoundcloudIE
+from .tunein import TuneInBaseIE
 from .vbox7 import Vbox7IE
 from .dbtv import DBTVIE
 from .piksel import PikselIE
@@ -828,6 +829,21 @@ class GenericIE(InfoExtractor):
            },
            'playlist_mincount': 7,
        },
+        # TuneIn station embed
+        {
+            'url': 'http://radiocnrv.com/promouvoir-radio-cnrv/',
+            'info_dict': {
+                'id': '204146',
+                'ext': 'mp3',
+                'title': 'CNRV',
+                'location': 'Paris, France',
+                'is_live': True,
+            },
+            'params': {
+                # Live stream
+                'skip_download': True,
+            },
+        },
        # Livestream embed
        {
            'url': 'http://www.esa.int/Our_Activities/Space_Science/Rosetta/Philae_comet_touch-down_webcast',
@@ -2088,6 +2104,11 @@ class GenericIE(InfoExtractor):
         if soundcloud_urls:
             return _playlist_from_matches(soundcloud_urls, getter=unescapeHTML, ie=SoundcloudIE.ie_key())
 
+        # Look for tunein player
+        tunein_urls = TuneInBaseIE._extract_urls(webpage)
+        if tunein_urls:
+            return _playlist_from_matches(tunein_urls)
+
         # Look for embedded mtvservices player
         mtvservices_url = MTVServicesEmbeddedIE._extract_url(webpage)
         if mtvservices_url:
youtube_dl/extractor/nrk.py

@@ -207,7 +207,15 @@ class NRKIE(NRKBaseIE):
 
 class NRKTVIE(NRKBaseIE):
     IE_DESC = 'NRK TV and NRK Radio'
-    _VALID_URL = r'https?://(?:tv|radio)\.nrk(?:super)?\.no/(?:serie/[^/]+|program)/(?P<id>[a-zA-Z]{4}\d{8})(?:/\d{2}-\d{2}-\d{4})?(?:#del=(?P<part_id>\d+))?'
+    _EPISODE_RE = r'(?P<id>[a-zA-Z]{4}\d{8})'
+    _VALID_URL = r'''(?x)
+                        https?://
+                            (?:tv|radio)\.nrk(?:super)?\.no/
+                            (?:serie/[^/]+|program)/
+                            (?![Ee]pisodes)%s
+                            (?:/\d{2}-\d{2}-\d{4})?
+                            (?:\#del=(?P<part_id>\d+))?
+                    ''' % _EPISODE_RE
     _API_HOST = 'psapi-we.nrk.no'
 
     _TESTS = [{
@@ -286,9 +294,30 @@ class NRKTVDirekteIE(NRKTVIE):
     }]
 
 
-class NRKPlaylistIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?nrk\.no/(?!video|skole)(?:[^/]+/)+(?P<id>[^/]+)'
+class NRKPlaylistBaseIE(InfoExtractor):
+    def _extract_description(self, webpage):
+        pass
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        entries = [
+            self.url_result('nrk:%s' % video_id, NRKIE.ie_key())
+            for video_id in re.findall(self._ITEM_RE, webpage)
+        ]
+
+        playlist_title = self._extract_title(webpage)
+        playlist_description = self._extract_description(webpage)
+
+        return self.playlist_result(
+            entries, playlist_id, playlist_title, playlist_description)
+
+
+class NRKPlaylistIE(NRKPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?nrk\.no/(?!video|skole)(?:[^/]+/)+(?P<id>[^/]+)'
+    _ITEM_RE = r'class="[^"]*\brich\b[^"]*"[^>]+data-video-id="([^"]+)"'
     _TESTS = [{
         'url': 'http://www.nrk.no/troms/gjenopplev-den-historiske-solformorkelsen-1.12270763',
         'info_dict': {
@@ -307,23 +336,28 @@ class NRKPlaylistIE(InfoExtractor):
         'playlist_count': 5,
     }]
 
-    def _real_extract(self, url):
-        playlist_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, playlist_id)
-
-        entries = [
-            self.url_result('nrk:%s' % video_id, 'NRK')
-            for video_id in re.findall(
-                r'class="[^"]*\brich\b[^"]*"[^>]+data-video-id="([^"]+)"',
-                webpage)
-        ]
-
-        playlist_title = self._og_search_title(webpage)
-        playlist_description = self._og_search_description(webpage)
-
-        return self.playlist_result(
-            entries, playlist_id, playlist_title, playlist_description)
+    def _extract_title(self, webpage):
+        return self._og_search_title(webpage, fatal=False)
+
+    def _extract_description(self, webpage):
+        return self._og_search_description(webpage)
+
+
+class NRKTVEpisodesIE(NRKPlaylistBaseIE):
+    _VALID_URL = r'https?://tv\.nrk\.no/program/[Ee]pisodes/[^/]+/(?P<id>\d+)'
+    _ITEM_RE = r'data-episode=["\']%s' % NRKTVIE._EPISODE_RE
+    _TESTS = [{
+        'url': 'https://tv.nrk.no/program/episodes/nytt-paa-nytt/69031',
+        'info_dict': {
+            'id': '69031',
+            'title': 'Nytt på nytt, sesong: 201210',
+        },
+        'playlist_count': 4,
+    }]
+
+    def _extract_title(self, webpage):
+        return self._html_search_regex(
+            r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
 
 
 class NRKSkoleIE(InfoExtractor):
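The NRK refactoring above is a template-method pattern: the base class owns the playlist walk, while subclasses supply `_ITEM_RE` and the title/description hooks. A stripped-down sketch of the same shape (class and method names here are illustrative, not the real extractors):

```python
import re


class PlaylistBase(object):
    """Owns the shared extraction flow; subclasses plug in the specifics."""

    def extract(self, webpage):
        ids = re.findall(self.ITEM_RE, webpage)
        return {'title': self.extract_title(webpage), 'ids': ids}


class EpisodesPlaylist(PlaylistBase):
    # Same episode-id shape NRKTVIE._EPISODE_RE describes: 4 letters + 8 digits
    ITEM_RE = r'data-episode=["\']([a-zA-Z]{4}\d{8})'

    def extract_title(self, webpage):
        m = re.search(r'<h1>([^<]+)</h1>', webpage)
        return m.group(1) if m else None


page = '<h1>Nytt paa nytt</h1><li data-episode="MUHH44000812"></li>'
result = EpisodesPlaylist().extract(page)
print(result)  # {'title': 'Nytt paa nytt', 'ids': ['MUHH44000812']}
```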
youtube_dl/extractor/pornhub.py

@@ -229,7 +229,14 @@ class PornHubPlaylistBaseIE(InfoExtractor):
 
         webpage = self._download_webpage(url, playlist_id)
 
-        entries = self._extract_entries(webpage)
+        # Only process container div with main playlist content skipping
+        # drop-down menu that uses similar pattern for videos (see
+        # https://github.com/rg3/youtube-dl/issues/11594).
+        container = self._search_regex(
+            r'(?s)(<div[^>]+class=["\']container.+)', webpage,
+            'container', default=webpage)
+
+        entries = self._extract_entries(container)
 
         playlist = self._parse_json(
             self._search_regex(
@@ -243,12 +250,12 @@ class PornHubPlaylistBaseIE(InfoExtractor):
 class PornHubPlaylistIE(PornHubPlaylistBaseIE):
     _VALID_URL = r'https?://(?:www\.)?pornhub\.com/playlist/(?P<id>\d+)'
     _TESTS = [{
-        'url': 'http://www.pornhub.com/playlist/6201671',
+        'url': 'http://www.pornhub.com/playlist/4667351',
         'info_dict': {
-            'id': '6201671',
-            'title': 'P0p4',
+            'id': '4667351',
+            'title': 'Nataly Hot',
         },
-        'playlist_mincount': 35,
+        'playlist_mincount': 2,
     }]
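The container-narrowing fix above can be illustrated with a tiny standalone sketch: restrict matching to the main content div so a menu that uses a similar link pattern is ignored (the HTML and `viewkey` values here are invented):

```python
import re

webpage = (
    '<nav><a href="/view_video.php?viewkey=menu1">menu</a></nav>'
    '<div class="container"><a href="/view_video.php?viewkey=real1">v</a></div>'
)

# Same regex shape as the fix: grab everything from the container div onward
m = re.search(r'(?s)(<div[^>]+class=["\']container.+)', webpage)
scope = m.group(1) if m else webpage  # fall back to the whole page

print(re.findall(r'viewkey=(\w+)', scope))  # ['real1']
```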
youtube_dl/extractor/tunein.py

@@ -11,6 +11,12 @@ from ..compat import compat_urlparse
 class TuneInBaseIE(InfoExtractor):
     _API_BASE_URL = 'http://tunein.com/tuner/tune/'
 
+    @staticmethod
+    def _extract_urls(webpage):
+        return re.findall(
+            r'<iframe[^>]+src=["\'](?P<url>(?:https?://)?tunein\.com/embed/player/[pst]\d+)',
+            webpage)
+
     def _real_extract(self, url):
         content_id = self._match_id(url)
 
@@ -69,82 +75,83 @@ class TuneInClipIE(TuneInBaseIE):
     _VALID_URL = r'https?://(?:www\.)?tunein\.com/station/.*?audioClipId\=(?P<id>\d+)'
     _API_URL_QUERY = '?tuneType=AudioClip&audioclipId=%s'
 
-    _TESTS = [
-        {
-            'url': 'http://tunein.com/station/?stationId=246119&audioClipId=816',
-            'md5': '99f00d772db70efc804385c6b47f4e77',
-            'info_dict': {
-                'id': '816',
-                'title': '32m',
-                'ext': 'mp3',
-            },
-        },
-    ]
+    _TESTS = [{
+        'url': 'http://tunein.com/station/?stationId=246119&audioClipId=816',
+        'md5': '99f00d772db70efc804385c6b47f4e77',
+        'info_dict': {
+            'id': '816',
+            'title': '32m',
+            'ext': 'mp3',
+        },
+    }]
 
 
 class TuneInStationIE(TuneInBaseIE):
     IE_NAME = 'tunein:station'
-    _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-s|station/.*?StationId\=)(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-s|station/.*?StationId=|embed/player/s)(?P<id>\d+)'
     _API_URL_QUERY = '?tuneType=Station&stationId=%s'
 
     @classmethod
     def suitable(cls, url):
         return False if TuneInClipIE.suitable(url) else super(TuneInStationIE, cls).suitable(url)
 
-    _TESTS = [
-        {
-            'url': 'http://tunein.com/radio/Jazz24-885-s34682/',
-            'info_dict': {
-                'id': '34682',
-                'title': 'Jazz 24 on 88.5 Jazz24 - KPLU-HD2',
-                'ext': 'mp3',
-                'location': 'Tacoma, WA',
-            },
-            'params': {
-                'skip_download': True,  # live stream
-            },
-        },
-    ]
+    _TESTS = [{
+        'url': 'http://tunein.com/radio/Jazz24-885-s34682/',
+        'info_dict': {
+            'id': '34682',
+            'title': 'Jazz 24 on 88.5 Jazz24 - KPLU-HD2',
+            'ext': 'mp3',
+            'location': 'Tacoma, WA',
+        },
+        'params': {
+            'skip_download': True,  # live stream
+        },
+    }, {
+        'url': 'http://tunein.com/embed/player/s6404/',
+        'only_matching': True,
+    }]
 
 
 class TuneInProgramIE(TuneInBaseIE):
     IE_NAME = 'tunein:program'
-    _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-p|program/.*?ProgramId\=)(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-p|program/.*?ProgramId=|embed/player/p)(?P<id>\d+)'
     _API_URL_QUERY = '?tuneType=Program&programId=%s'
 
-    _TESTS = [
-        {
-            'url': 'http://tunein.com/radio/Jazz-24-p2506/',
-            'info_dict': {
-                'id': '2506',
-                'title': 'Jazz 24 on 91.3 WUKY-HD3',
-                'ext': 'mp3',
-                'location': 'Lexington, KY',
-            },
-            'params': {
-                'skip_download': True,  # live stream
-            },
-        },
-    ]
+    _TESTS = [{
+        'url': 'http://tunein.com/radio/Jazz-24-p2506/',
+        'info_dict': {
+            'id': '2506',
+            'title': 'Jazz 24 on 91.3 WUKY-HD3',
+            'ext': 'mp3',
+            'location': 'Lexington, KY',
+        },
+        'params': {
+            'skip_download': True,  # live stream
+        },
+    }, {
+        'url': 'http://tunein.com/embed/player/p191660/',
+        'only_matching': True,
+    }]
 
 
 class TuneInTopicIE(TuneInBaseIE):
     IE_NAME = 'tunein:topic'
-    _VALID_URL = r'https?://(?:www\.)?tunein\.com/topic/.*?TopicId\=(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:topic/.*?TopicId=|embed/player/t)(?P<id>\d+)'
    _API_URL_QUERY = '?tuneType=Topic&topicId=%s'
 
-    _TESTS = [
-        {
-            'url': 'http://tunein.com/topic/?TopicId=101830576',
-            'md5': 'c31a39e6f988d188252eae7af0ef09c9',
-            'info_dict': {
-                'id': '101830576',
-                'title': 'Votez pour moi du 29 octobre 2015 (29/10/15)',
-                'ext': 'mp3',
-                'location': 'Belgium',
-            },
-        },
-    ]
+    _TESTS = [{
+        'url': 'http://tunein.com/topic/?TopicId=101830576',
+        'md5': 'c31a39e6f988d188252eae7af0ef09c9',
+        'info_dict': {
+            'id': '101830576',
+            'title': 'Votez pour moi du 29 octobre 2015 (29/10/15)',
+            'ext': 'mp3',
+            'location': 'Belgium',
+        },
+    }, {
+        'url': 'http://tunein.com/embed/player/t101830576/',
+        'only_matching': True,
+    }]
 
 
 class TuneInShortenerIE(InfoExtractor):
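The new `TuneInBaseIE._extract_urls` is just a `re.findall` over the page for TuneIn embed iframes. It can be exercised standalone; the iframe markup below is a made-up example modelled on the station id from the generic.py test:

```python
import re

# Same pattern the new _extract_urls uses: [pst] selects station, program,
# or topic embed URLs
pattern = (r'<iframe[^>]+src=["\'](?P<url>(?:https?://)?'
           r'tunein\.com/embed/player/[pst]\d+)')

webpage = '''
<iframe style="border:none" src="http://tunein.com/embed/player/s204146/"
        width="300" height="100"></iframe>
'''

# findall returns the single capturing group, i.e. the embed URL itself
print(re.findall(pattern, webpage))  # ['http://tunein.com/embed/player/s204146']
```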
@@ -1,262 +1,312 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import functools
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
unified_strdate,
|
||||
OnDemandPagedList,
|
||||
xpath_text,
|
||||
determine_ext,
|
||||
int_or_none,
|
||||
NO_DEFAULT,
|
||||
orderedSet,
|
||||
parse_codecs,
|
||||
qualities,
|
||||
float_or_none,
|
||||
ExtractorError,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
class ZDFIE(InfoExtractor):
|
||||
_VALID_URL = r'(?:zdf:|zdf:video:|https?://www\.zdf\.de/ZDFmediathek(?:#)?/(.*beitrag/(?:video/)?))(?P<id>[0-9]+)(?:/[^/?]+)?(?:\?.*)?'
|
||||
class ZDFBaseIE(InfoExtractor):
|
||||
def _call_api(self, url, player, referrer, video_id):
|
||||
return self._download_json(
|
||||
url, video_id, 'Downloading JSON content',
|
||||
headers={
|
||||
'Referer': referrer,
|
||||
'Api-Auth': 'Bearer %s' % player['apiToken'],
|
||||
})
|
||||
|
||||
def _extract_player(self, webpage, video_id, fatal=True):
|
||||
return self._parse_json(
|
||||
self._search_regex(
|
||||
r'(?s)data-zdfplayer-jsb=(["\'])(?P<json>{.+?})\1', webpage,
|
||||
'player JSON', default='{}' if not fatal else NO_DEFAULT,
|
||||
group='json'),
|
||||
video_id)
|
||||
|
||||
|
||||


class ZDFIE(ZDFBaseIE):
    _VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?]+)\.html'
    _QUALITIES = ('auto', 'low', 'med', 'high', 'veryhigh')

    _TESTS = [{
        'url': 'https://www.zdf.de/service-und-hilfe/die-neue-zdf-mediathek/zdfmediathek-trailer-100.html',
        'info_dict': {
            'id': 'zdfmediathek-trailer-100',
            'ext': 'mp4',
            'title': 'Die neue ZDFmediathek',
            'description': 'md5:3003d36487fb9a5ea2d1ff60beb55e8d',
            'duration': 30,
            'timestamp': 1477627200,
            'upload_date': '20161028',
        },
    }, {
        'url': 'https://www.zdf.de/filme/taunuskrimi/die-lebenden-und-die-toten-1---ein-taunuskrimi-100.html',
        'only_matching': True,
    }, {
        'url': 'https://www.zdf.de/dokumentation/planet-e/planet-e-uebersichtsseite-weitere-dokumentationen-von-planet-e-100.html',
        'only_matching': True,
    }]

    @staticmethod
    def _extract_subtitles(src):
        subtitles = {}
        for caption in try_get(src, lambda x: x['captions'], list) or []:
            subtitle_url = caption.get('uri')
            if subtitle_url and isinstance(subtitle_url, compat_str):
                lang = caption.get('language', 'deu')
                subtitles.setdefault(lang, []).append({
                    'url': subtitle_url,
                })
        return subtitles

    def _extract_format(self, video_id, formats, format_urls, meta):
        format_url = meta.get('url')
        if not format_url or not isinstance(format_url, compat_str):
            return
        if format_url in format_urls:
            return
        format_urls.add(format_url)
        mime_type = meta.get('mimeType')
        ext = determine_ext(format_url)
        if mime_type == 'application/x-mpegURL' or ext == 'm3u8':
            formats.extend(self._extract_m3u8_formats(
                format_url, video_id, 'mp4', m3u8_id='hls',
                entry_protocol='m3u8_native', fatal=False))
        elif mime_type == 'application/f4m+xml' or ext == 'f4m':
            formats.extend(self._extract_f4m_formats(
                update_url_query(format_url, {'hdcore': '3.7.0'}),
                video_id, f4m_id='hds', fatal=False))
        else:
            f = parse_codecs(meta.get('mimeCodec'))
            format_id = ['http']
            for p in (meta.get('type'), meta.get('quality')):
                if p and isinstance(p, compat_str):
                    format_id.append(p)
            f.update({
                'url': format_url,
                'format_id': '-'.join(format_id),
                'format_note': meta.get('quality'),
                'language': meta.get('language'),
                'quality': qualities(self._QUALITIES)(meta.get('quality')),
                'preference': -10,
            })
            formats.append(f)

    def _extract_entry(self, url, content, video_id):
        title = content.get('title') or content['teaserHeadline']

        t = content['mainVideoContent']['http://zdf.de/rels/target']

        ptmd_path = t.get('http://zdf.de/rels/streams/ptmd')

        if not ptmd_path:
            ptmd_path = t[
                'http://zdf.de/rels/streams/ptmd-template'].replace(
                '{playerId}', 'portal')

        # the PTMD document lists the available streams grouped by priority
        ptmd = self._download_json(urljoin(url, ptmd_path), video_id)

        formats = []
        track_uris = set()
        for p in ptmd['priorityList']:
            formitaeten = p.get('formitaeten')
            if not isinstance(formitaeten, list):
                continue
            for f in formitaeten:
                f_qualities = f.get('qualities')
                if not isinstance(f_qualities, list):
                    continue
                for quality in f_qualities:
                    tracks = try_get(
                        quality, lambda x: x['audio']['tracks'], list)
                    if not tracks:
                        continue
                    for track in tracks:
                        self._extract_format(
                            video_id, formats, track_uris, {
                                'url': track.get('uri'),
                                'type': f.get('type'),
                                'mimeType': f.get('mimeType'),
                                'quality': quality.get('quality'),
                                'language': track.get('language'),
                            })
        self._sort_formats(formats)

        thumbnails = []
        layouts = try_get(
            content, lambda x: x['teaserImageRef']['layouts'], dict)
        if layouts:
            for layout_key, layout_url in layouts.items():
                if not isinstance(layout_url, compat_str):
                    continue
                thumbnail = {
                    'url': layout_url,
                    'format_id': layout_key,
                }
                # layout keys of the form "<width>x<height>" encode the size
                mobj = re.search(r'(?P<width>\d+)x(?P<height>\d+)', layout_key)
                if mobj:
                    thumbnail.update({
                        'width': int(mobj.group('width')),
                        'height': int(mobj.group('height')),
                    })
                thumbnails.append(thumbnail)

        return {
            'id': video_id,
            'title': title,
            'description': content.get('leadParagraph') or content.get('teasertext'),
            'duration': int_or_none(t.get('duration')),
            'timestamp': unified_timestamp(content.get('editorialDate')),
            'thumbnails': thumbnails,
            'subtitles': self._extract_subtitles(ptmd),
            'formats': formats,
        }

    def _extract_regular(self, url, player, video_id):
        content = self._call_api(player['content'], player, url, video_id)
        return self._extract_entry(player['content'], content, video_id)

    def _extract_mobile(self, video_id):
        document = self._download_json(
            'https://zdf-cdn.live.cellular.de/mediathekV2/document/%s' % video_id,
            video_id)['document']

        title = document['titel']

        formats = []
        format_urls = set()
        for f in document['formitaeten']:
            self._extract_format(video_id, formats, format_urls, f)
        self._sort_formats(formats)

        thumbnails = []
        teaser_bild = document.get('teaserBild')
        if isinstance(teaser_bild, dict):
            for thumbnail_key, thumbnail in teaser_bild.items():
                thumbnail_url = try_get(
                    thumbnail, lambda x: x['url'], compat_str)
                if thumbnail_url:
                    thumbnails.append({
                        'url': thumbnail_url,
                        'id': thumbnail_key,
                        'width': int_or_none(thumbnail.get('width')),
                        'height': int_or_none(thumbnail.get('height')),
                    })

        return {
            'id': video_id,
            'title': title,
            'description': document.get('beschreibung'),
            'duration': int_or_none(document.get('length')),
            'timestamp': unified_timestamp(try_get(
                document, lambda x: x['meta']['editorialDate'], compat_str)),
            'thumbnails': thumbnails,
            'subtitles': self._extract_subtitles(document),
            'formats': formats,
        }

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id, fatal=False)
        if webpage:
            player = self._extract_player(webpage, url, fatal=False)
            if player:
                return self._extract_regular(url, player, video_id)

        # fall back to the mobile API when no player config is embedded
        return self._extract_mobile(video_id)


class ZDFChannelIE(ZDFBaseIE):
    _VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'https://www.zdf.de/sport/das-aktuelle-sportstudio',
        'info_dict': {
            'id': 'das-aktuelle-sportstudio',
            'title': 'das aktuelle sportstudio | ZDF',
        },
        'playlist_count': 21,
    }, {
        'url': 'https://www.zdf.de/dokumentation/planet-e',
        'info_dict': {
            'id': 'planet-e',
            'title': 'planet e.',
        },
        'playlist_count': 4,
    }, {
        'url': 'https://www.zdf.de/filme/taunuskrimi/',
        'only_matching': True,
    }]

    @classmethod
    def suitable(cls, url):
        # defer to ZDFIE for URLs it can handle itself
        return False if ZDFIE.suitable(url) else super(ZDFChannelIE, cls).suitable(url)

    def _real_extract(self, url):
        channel_id = self._match_id(url)

        webpage = self._download_webpage(url, channel_id)

        entries = [
            self.url_result(item_url, ie=ZDFIE.ie_key())
            for item_url in orderedSet(re.findall(
                r'data-plusbar-url=["\'](http.+?\.html)', webpage))]

        return self.playlist_result(
            entries, channel_id, self._og_search_title(webpage, fatal=False))
"""
|
||||
player = self._extract_player(webpage, channel_id)
|
||||
|
||||
channel_id = self._search_regex(
|
||||
r'docId\s*:\s*(["\'])(?P<id>(?!\1).+?)\1', webpage,
|
||||
'channel id', group='id')
|
||||
|
||||
channel = self._call_api(
|
||||
'https://api.zdf.de/content/documents/%s.json' % channel_id,
|
||||
player, url, channel_id)
|
||||
|
||||
items = []
|
||||
for module in channel['module']:
|
||||
for teaser in try_get(module, lambda x: x['teaser'], list) or []:
|
||||
t = try_get(
|
||||
teaser, lambda x: x['http://zdf.de/rels/target'], dict)
|
||||
if not t:
|
||||
continue
|
||||
items.extend(try_get(
|
||||
t,
|
||||
lambda x: x['resultsWithVideo']['http://zdf.de/rels/search/results'],
|
||||
list) or [])
|
||||
items.extend(try_get(
|
||||
module,
|
||||
lambda x: x['filterRef']['resultsWithVideo']['http://zdf.de/rels/search/results'],
|
||||
list) or [])
|
||||
|
||||
entries = []
|
||||
entry_urls = set()
|
||||
for item in items:
|
||||
t = try_get(item, lambda x: x['http://zdf.de/rels/target'], dict)
|
||||
if not t:
|
||||
continue
|
||||
sharing_url = t.get('http://zdf.de/rels/sharing-url')
|
||||
if not sharing_url or not isinstance(sharing_url, compat_str):
|
||||
continue
|
||||
if sharing_url in entry_urls:
|
||||
continue
|
||||
entry_urls.add(sharing_url)
|
||||
entries.append(self.url_result(
|
||||
sharing_url, ie=ZDFIE.ie_key(), video_id=t.get('id')))
|
||||
|
||||
return self.playlist_result(entries, channel_id, channel.get('title'))
|
||||
"""
|
||||
|
||||

@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.12.31'
+__version__ = '2017.01.05'