Compare commits

32 Commits

Author SHA1 Message Date
Sergey M․
3a0ceb32e2 release 2018.03.10 2018-03-10 04:45:57 +07:00
Sergey M․
7dee417127 [ChangeLog] Actualize
[ci skip]
2018-03-10 04:44:46 +07:00
Sergey M․
5b1d158834 [raywenderlich] Extract videos in order 2018-03-10 04:31:51 +07:00
Eitan Postavsky
a7298f3e99 [pornhub] Don't override session cookies (closes #15697) 2018-03-09 23:57:32 +07:00
Sergey M․
5d49d879cc [raywenderlich] Add extractor (#15251) 2018-03-09 23:27:44 +07:00
Sergey M․
b5434b5c31 [nexx] Fix typo 2018-03-08 03:25:04 +07:00
Sergey M․
690404a6f8 [funk] Fix extraction and rework extractors (closes #15792) 2018-03-08 03:17:46 +07:00
Sergey M․
d91dd0ce19 [nexx] Restore reverse engineered approach 2018-03-08 03:16:21 +07:00
kayb94
6202f08e1b [heise] Add support for kaltura embeds (closes #14961) 2018-03-06 23:10:01 +07:00
Sergey M․
574e9db2b0 [tvnow] Extract series metadata (closes #15774) 2018-03-06 23:06:00 +07:00
Toni Viemerö
2e25f80d5d [ruutu] Continue formats extraction on NOT-USED URLs 2018-03-06 02:01:04 +07:00
Sergey M․
64f34528df [vrtnu] Use redirect URL for building video JSON URL (closes #15767, closes #15769) 2018-03-05 22:57:19 +07:00
Sergey M․
26ad6bcdfc [vimeo] Modernize login code and improve error messaging 2018-03-05 22:45:47 +07:00
Sergey M․
81dc74966a [archiveorg] Fix extraction (closes #15770, closes #15772) 2018-03-05 22:30:32 +07:00
Sergey M․
d53b6764d0 [hidive] Remove proxy from params 2018-03-04 23:23:30 +07:00
Sergey M․
62f49dd3b9 [hidive] Add extractor (closes #15494) 2018-03-04 17:46:36 +07:00
Sergey M․
f9f10268c1 [afreecatv] Detect deleted videos 2018-03-04 03:13:45 +07:00
Sergey M․
f241a97312 [afreecatv] Fix extraction (closes #15755) 2018-03-04 03:01:58 +07:00
Sergey M․
86c8cfc555 [vice] Fix extraction and rework extractors (closes #11101, closes #13019, closes #13622, closes #13778) 2018-03-03 23:08:43 +07:00
Sergey M․
c01db237b5 [vidzi] Add support for vidzi.si (closes #15751) 2018-03-03 20:16:55 +07:00
Sergey M․
0093c77032 [downloader/hls] Skip uplynk ad fragments (closes #15748) 2018-03-03 20:00:25 +07:00
Sergey M․
5616caf852 [npo] Fix typo 2018-03-03 01:47:09 +07:00
Sergey M․
05a7ffb126 release 2018.03.03 2018-03-03 01:37:01 +07:00
Sergey M․
28f21c9501 [ChangeLog] Actualize
[ci skip]
2018-03-03 01:32:21 +07:00
Sergey M․
4c780fbd0a [yapfiles] Add extractor (closes #15726, refs #11085) 2018-03-03 01:24:36 +07:00
Sergey M․
7773a92800 [spankbang] Fix formats extraction (closes #15727) 2018-03-02 23:39:20 +07:00
Sergey M․
b871d7e954 [utils] Add parse_resolution 2018-03-02 23:39:04 +07:00
Remita Amine
44dc11db61 [adn] fix format extraction(#15716) 2018-02-28 19:41:30 +01:00
Sergey M․
949faa15e8 [toggle] Extract DASH and ISM formats (closes #15721) 2018-02-28 22:55:09 +07:00
Sergey M․
0c3e5f4921 Revert "Respect --prefer-insecure while updating (closes #15497)"
This reverts commit 7d2b4aa047.
2018-02-27 22:30:08 +07:00
Sergey M․
266fbd6b73 [nickelodeon] Add support for nickelodeon.com.tr (closes #15706) 2018-02-26 22:10:44 +07:00
Sergey M․
d1b6187012 [npo] Validate and filter format URLs (closes #15709) 2018-02-26 21:50:51 +07:00
34 changed files with 942 additions and 253 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.02.26*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.02.26**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.03.10*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.03.10**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2018.02.26
[debug] youtube-dl version 2018.03.10
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -1,3 +1,42 @@
version 2018.03.10
Core
* [downloader/hls] Skip uplynk ad fragments (#15748)
Extractors
* [pornhub] Don't override session cookies (#15697)
+ [raywenderlich] Add support for videos.raywenderlich.com (#15251)
* [funk] Fix extraction and rework extractors (#15792)
* [nexx] Restore reverse engineered approach
+ [heise] Add support for kaltura embeds (#14961, #15728)
+ [tvnow] Extract series metadata (#15774)
* [ruutu] Continue formats extraction on NOT-USED URLs (#15775)
* [vrtnu] Use redirect URL for building video JSON URL (#15767, #15769)
* [vimeo] Modernize login code and improve error messaging
* [archiveorg] Fix extraction (#15770, #15772)
+ [hidive] Add support for hidive.com (#15494)
* [afreecatv] Detect deleted videos
* [afreecatv] Fix extraction (#15755)
* [vice] Fix extraction and rework extractors (#11101, #13019, #13622, #13778)
+ [vidzi] Add support for vidzi.si (#15751)
* [npo] Fix typo
version 2018.03.03
Core
+ [utils] Add parse_resolution
Revert respect --prefer-insecure while updating
Extractors
+ [yapfiles] Add support for yapfiles.ru (#15726, #11085)
* [spankbang] Fix formats extraction (#15727)
* [adn] Fix extraction (#15716)
+ [toggle] Extract DASH and ISM formats (#15721)
+ [nickelodeon] Add support for nickelodeon.com.tr (#15706)
* [npo] Validate and filter format URLs (#15709)
version 2018.02.26
Extractors

View File

@@ -310,7 +310,8 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--encoding ENCODING Force the specified encoding (experimental)
--no-check-certificate Suppress HTTPS certificate validation
--prefer-insecure Use an unencrypted connection to retrieve
information whenever possible
information about the video. (Currently
supported only for YouTube)
--user-agent UA Specify a custom user agent
--referer URL Specify a custom referer, use if the video
access is restricted to one domain
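
The reworded help text above also applies to the embedding API, where the same switch is exposed as the `prefer_insecure` params key. A minimal sketch, assuming the canonical test video from the issue template; this is an illustration, not part of the diff:

import youtube_dl

# Minimal sketch: the 'prefer_insecure' params key mirrors --prefer-insecure
ydl_opts = {'prefer_insecure': True}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info(
        'https://www.youtube.com/watch?v=BaW_jenozKc', download=False)
    print(info.get('title'))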

View File

@@ -298,7 +298,8 @@
- **freespeech.org**
- **FreshLive**
- **Funimation**
- **Funk**
- **FunkChannel**
- **FunkMix**
- **FunnyOrDie**
- **Fusion**
- **Fux**
@@ -336,6 +337,7 @@
- **HentaiStigma**
- **hetklokhuis**
- **hgtv.com:show**
- **HiDive**
- **HistoricFilms**
- **history:topic**: History.com Topic
- **hitbox**
@@ -674,6 +676,7 @@
- **RaiPlay**
- **RaiPlayLive**
- **RaiPlayPlaylist**
- **RayWenderlich**
- **RBMARadio**
- **RDS**: RDS.ca
- **RedBullTV**
@@ -934,7 +937,6 @@
- **vice**
- **vice:article**
- **vice:show**
- **Viceland**
- **Vidbit**
- **Viddler**
- **Videa**
@@ -1055,6 +1057,7 @@
- **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
- **yandexmusic:track**: Яндекс.Музыка - Трек
- **YapFiles**
- **YesJapan**
- **yinyuetai:video**: 音悦Tai
- **Ynet**

View File

@@ -53,6 +53,7 @@ from youtube_dl.utils import (
parse_filesize,
parse_count,
parse_iso8601,
parse_resolution,
pkcs1pad,
read_batch_urls,
sanitize_filename,
@@ -982,6 +983,16 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_count('1.1kk '), 1100000)
self.assertEqual(parse_count('1.1kk views'), 1100000)
def test_parse_resolution(self):
self.assertEqual(parse_resolution(None), {})
self.assertEqual(parse_resolution(''), {})
self.assertEqual(parse_resolution('1920x1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920×1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920 x 1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('720p'), {'height': 720})
self.assertEqual(parse_resolution('4k'), {'height': 2160})
self.assertEqual(parse_resolution('8K'), {'height': 4320})
def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))

View File

@@ -438,7 +438,7 @@ def _real_main(argv=None):
with YoutubeDL(ydl_opts) as ydl:
# Update version
if opts.update_self:
update_self(ydl.to_screen, opts.verbose, ydl._opener, opts.prefer_insecure)
update_self(ydl.to_screen, opts.verbose, ydl._opener)
# Remove cache dir
if opts.rm_cachedir:

View File

@@ -75,8 +75,9 @@ class HlsFD(FragmentFD):
fd.add_progress_hook(ph)
return fd.real_download(filename, info_dict)
def anvato_ad(s):
return s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
def is_ad_fragment(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s or
s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
media_frags = 0
ad_frags = 0
@@ -86,7 +87,7 @@ class HlsFD(FragmentFD):
if not line:
continue
if line.startswith('#'):
if anvato_ad(line):
if is_ad_fragment(line):
ad_frags += 1
ad_frag_next = True
continue
@@ -195,7 +196,7 @@ class HlsFD(FragmentFD):
'start': sub_range_start,
'end': sub_range_start + int(splitted_byte_range[0]),
}
elif anvato_ad(line):
elif is_ad_fragment(line):
ad_frag_next = True
self._finish_frag_download(ctx)
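
As an aside, the new predicate can be exercised on its own; a small self-contained sketch with made-up playlist lines (only the tag strings come from the diff above):

def is_ad_fragment(s):
    # Same check as in downloader/hls.py: Anvato tags ad segments with
    # '#ANVATO-SEGMENT-INFO ... type=ad', Uplynk ends them with ',ad'
    return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s or
            s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))

# Hypothetical playlist lines, for illustration only
assert is_ad_fragment('#ANVATO-SEGMENT-INFO: type=ad,dur=30.0')
assert is_ad_fragment('#UPLYNK-SEGMENT: 00000000,ad')
assert not is_ad_fragment('#UPLYNK-SEGMENT: 00000000,segment')
assert not is_ad_fragment('#EXTINF:6.006,')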

View File

@@ -51,7 +51,7 @@ class ADNIE(InfoExtractor):
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
bytes_to_intlist(b'\xc8\x6e\x06\xbc\xbe\xc6\x49\xf5\x88\x0d\xc8\x47\xc4\x27\x0c\x60'),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
))
subtitles_json = self._parse_json(
@@ -107,15 +107,18 @@ class ADNIE(InfoExtractor):
options = player_config.get('options') or {}
metas = options.get('metas') or {}
title = metas.get('title') or video_info['title']
links = player_config.get('links') or {}
sub_path = player_config.get('subtitles')
error = None
if not links:
links_url = player_config['linksurl']
links_url = player_config.get('linksurl') or options['videoUrl']
links_data = self._download_json(urljoin(
self._BASE_URL, links_url), video_id)
links = links_data.get('links') or {}
metas = metas or links_data.get('meta') or {}
sub_path = sub_path or links_data.get('subtitles')
error = links_data.get('error')
title = metas.get('title') or video_info['title']
formats = []
for format_id, qualities in links.items():
@@ -146,7 +149,7 @@ class ADNIE(InfoExtractor):
'description': strip_or_none(metas.get('summary') or video_info.get('resume')),
'thumbnail': video_info.get('image'),
'formats': formats,
'subtitles': self.extract_subtitles(player_config.get('subtitles'), video_id),
'subtitles': self.extract_subtitles(sub_path, video_id),
'episode': metas.get('subtitle') or video_info.get('videoTitle'),
'series': video_info.get('playlistTitle'),
}

View File

@@ -177,6 +177,10 @@ class AfreecaTVIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
if re.search(r'alert\(["\']This video has been deleted', webpage):
raise ExtractorError(
'Video %s has been deleted' % video_id, expected=True)
station_id = self._search_regex(
r'nStationNo\s*=\s*(\d+)', webpage, 'station')
bbs_id = self._search_regex(
@@ -200,10 +204,10 @@ class AfreecaTVIE(InfoExtractor):
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, flag), expected=True)
video_element = video_xml.findall(compat_xpath('./track/video'))[1]
video_element = video_xml.findall(compat_xpath('./track/video'))[-1]
if video_element is None or video_element.text is None:
raise ExtractorError('Specified AfreecaTV video does not exist',
expected=True)
raise ExtractorError(
'Video %s video does not exist' % video_id, expected=True)
video_url = video_element.text.strip()

View File

@@ -41,7 +41,7 @@ class ArchiveOrgIE(InfoExtractor):
webpage = self._download_webpage(
'http://archive.org/embed/' + video_id, video_id)
jwplayer_playlist = self._parse_json(self._search_regex(
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\);",
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)",
webpage, 'jwplayer playlist'), video_id)
info = self._parse_jwplayer_data(
{'playlist': jwplayer_playlist}, video_id, base_url=url)

View File

@@ -246,7 +246,7 @@ class VrtNUIE(GigyaBaseIE):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
webpage, urlh = self._download_webpage_handle(url, display_id)
title = self._html_search_regex(
r'(?ms)<h1 class="content__heading">(.+?)</h1>',
@@ -276,7 +276,7 @@ class VrtNUIE(GigyaBaseIE):
webpage, 'release_date', default=None))
# If there's a ? or a # in the URL, remove them and everything after
clean_url = url.split('?')[0].split('#')[0].strip('/')
clean_url = urlh.geturl().split('?')[0].split('#')[0].strip('/')
securevideo_url = clean_url + '.mssecurevideo.json'
try:

View File

@@ -385,7 +385,10 @@ from .freesound import FreesoundIE
from .freespeech import FreespeechIE
from .freshlive import FreshLiveIE
from .funimation import FunimationIE
from .funk import FunkIE
from .funk import (
FunkMixIE,
FunkChannelIE,
)
from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE
from .fxnetworks import FXNetworksIE
@@ -429,6 +432,7 @@ from .hellporno import HellPornoIE
from .helsinki import HelsinkiIE
from .hentaistigma import HentaiStigmaIE
from .hgtv import HGTVComShowIE
from .hidive import HiDiveIE
from .historicfilms import HistoricFilmsIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hitrecord import HitRecordIE
@@ -871,6 +875,7 @@ from .rai import (
RaiPlayPlaylistIE,
RaiIE,
)
from .raywenderlich import RayWenderlichIE
from .rbmaradio import RBMARadioIE
from .rds import RDSIE
from .redbulltv import RedBullTVIE
@@ -1210,7 +1215,6 @@ from .vice import (
ViceArticleIE,
ViceShowIE,
)
from .viceland import VicelandIE
from .vidbit import VidbitIE
from .viddler import ViddlerIE
from .videa import VideaIE
@@ -1369,6 +1373,7 @@ from .yandexmusic import (
YandexMusicPlaylistIE,
)
from .yandexdisk import YandexDiskIE
from .yapfiles import YapFilesIE
from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE
from .ynet import YnetIE

View File

@@ -1,43 +1,102 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .nexx import NexxIE
from ..utils import extract_attributes
from ..utils import int_or_none
class FunkIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?funk\.net/(?:mix|channel)/(?:[^/]+/)*(?P<id>[^?/#]+)'
class FunkBaseIE(InfoExtractor):
def _make_url_result(self, video):
return {
'_type': 'url_transparent',
'url': 'nexx:741:%s' % video['sourceId'],
'ie_key': NexxIE.ie_key(),
'id': video['sourceId'],
'title': video.get('title'),
'description': video.get('description'),
'duration': int_or_none(video.get('duration')),
'season_number': int_or_none(video.get('seasonNr')),
'episode_number': int_or_none(video.get('episodeNr')),
}
class FunkMixIE(FunkBaseIE):
_VALID_URL = r'https?://(?:www\.)?funk\.net/mix/(?P<id>[^/]+)/(?P<alias>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.funk.net/mix/59d65d935f8b160001828b5b/0/59d517e741dca10001252574/',
'md5': '4d40974481fa3475f8bccfd20c5361f8',
'url': 'https://www.funk.net/mix/59d65d935f8b160001828b5b/die-realste-kifferdoku-aller-zeiten',
'md5': '8edf617c2f2b7c9847dfda313f199009',
'info_dict': {
'id': '716599',
'id': '123748',
'ext': 'mp4',
'title': 'Neue Rechte Welle',
'description': 'md5:a30a53f740ffb6bfd535314c2cc5fb69',
'timestamp': 1501337639,
'upload_date': '20170729',
'title': '"Die realste Kifferdoku aller Zeiten"',
'description': 'md5:c97160f5bafa8d47ec8e2e461012aa9d',
'timestamp': 1490274721,
'upload_date': '20170323',
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
mix_id = mobj.group('id')
alias = mobj.group('alias')
lists = self._download_json(
'https://www.funk.net/api/v3.1/curation/curatedLists/',
mix_id, headers={
'authorization': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbGllbnROYW1lIjoiY3VyYXRpb24tdG9vbC12Mi4wIiwic2NvcGUiOiJzdGF0aWMtY29udGVudC1hcGksY3VyYXRpb24tc2VydmljZSxzZWFyY2gtYXBpIn0.SGCC1IXHLtZYoo8PvRKlU2gXH1su8YSu47sB3S4iXBI',
'Referer': url,
}, query={
'size': 100,
})['result']['lists']
metas = next(
l for l in lists
if mix_id in (l.get('entityId'), l.get('alias')))['videoMetas']
video = next(
meta['videoDataDelegate']
for meta in metas if meta.get('alias') == alias)
return self._make_url_result(video)
class FunkChannelIE(FunkBaseIE):
_VALID_URL = r'https?://(?:www\.)?funk\.net/channel/(?P<id>[^/]+)/(?P<alias>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.funk.net/channel/ba/die-lustigsten-instrumente-aus-dem-internet-teil-2',
'info_dict': {
'id': '1155821',
'ext': 'mp4',
'title': 'Die LUSTIGSTEN INSTRUMENTE aus dem Internet - Teil 2',
'description': 'md5:a691d0413ef4835588c5b03ded670c1f',
'timestamp': 1514507395,
'upload_date': '20171229',
},
'params': {
'format': 'bestvideo',
'skip_download': True,
},
}, {
'url': 'https://www.funk.net/channel/59d5149841dca100012511e3/0/59d52049999264000182e79d/',
'url': 'https://www.funk.net/channel/59d5149841dca100012511e3/mein-erster-job-lovemilla-folge-1/lovemilla/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
channel_id = mobj.group('id')
alias = mobj.group('alias')
webpage = self._download_webpage(url, video_id)
results = self._download_json(
'https://www.funk.net/api/v3.0/content/videos/filter', channel_id,
headers={
'authorization': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbGllbnROYW1lIjoiY3VyYXRpb24tdG9vbCIsInNjb3BlIjoic3RhdGljLWNvbnRlbnQtYXBpLGN1cmF0aW9uLWFwaSxzZWFyY2gtYXBpIn0.q4Y2xZG8PFHai24-4Pjx2gym9RmJejtmK6lMXP5wAgc',
'Referer': url,
}, query={
'channelId': channel_id,
'size': 100,
})['result']
domain_id = NexxIE._extract_domain_id(webpage) or '741'
nexx_id = extract_attributes(self._search_regex(
r'(<div[^>]id=["\']mediaplayer-funk[^>]+>)',
webpage, 'media player'))['data-id']
video = next(r for r in results if r.get('alias') == alias)
return self.url_result(
'nexx:%s:%s' % (domain_id, nexx_id), ie=NexxIE.ie_key(),
video_id=nexx_id)
return self._make_url_result(video)

View File

@@ -102,6 +102,8 @@ from .channel9 import Channel9IE
from .vshare import VShareIE
from .mediasite import MediasiteIE
from .springboardplatform import SpringboardPlatformIE
from .yapfiles import YapFilesIE
from .vice import ViceIE
class GenericIE(InfoExtractor):
@@ -1970,6 +1972,18 @@ class GenericIE(InfoExtractor):
'params': {
'skip_download': True,
},
},
{
'url': 'https://www.yapfiles.ru/show/1872528/690b05d3054d2dbe1e69523aa21bb3b1.mp4.html',
'info_dict': {
'id': 'vMDE4NzI1Mjgt690b',
'ext': 'mp4',
'title': 'Котята',
},
'add_ie': [YapFilesIE.ie_key()],
'params': {
'skip_download': True,
},
}
# {
# # TODO: find another test
@@ -2947,6 +2961,16 @@ class GenericIE(InfoExtractor):
springboardplatform_urls, video_id, video_title,
ie=SpringboardPlatformIE.ie_key())
yapfiles_urls = YapFilesIE._extract_urls(webpage)
if yapfiles_urls:
return self.playlist_from_matches(
yapfiles_urls, video_id, video_title, ie=YapFilesIE.ie_key())
vice_urls = ViceIE._extract_urls(webpage)
if vice_urls:
return self.playlist_from_matches(
vice_urls, video_id, video_title, ie=ViceIE.ie_key())
def merge_dicts(dict1, dict2):
merged = {}
for k, v in dict1.items():

View File

@@ -2,11 +2,13 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from .kaltura import KalturaIE
from .youtube import YoutubeIE
from ..utils import (
determine_ext,
int_or_none,
parse_iso8601,
smuggle_url,
xpath_text,
)
@@ -42,6 +44,19 @@ class HeiseIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.heise.de/video/artikel/nachgehakt-Wie-sichert-das-c-t-Tool-Restric-tor-Windows-10-ab-3700244.html',
'md5': '4b58058b46625bdbd841fc2804df95fc',
'info_dict': {
'id': '1_ntrmio2s',
'timestamp': 1512470717,
'upload_date': '20171205',
'ext': 'mp4',
'title': 'ct10 nachgehakt hos restrictor',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.heise.de/ct/artikel/c-t-uplink-3-3-Owncloud-Tastaturen-Peilsender-Smartphone-2403911.html',
'only_matching': True,
@@ -67,9 +82,14 @@ class HeiseIE(InfoExtractor):
if yt_urls:
return self.playlist_from_matches(yt_urls, video_id, title, ie=YoutubeIE.ie_key())
kaltura_url = KalturaIE._extract_url(webpage)
if kaltura_url:
return self.url_result(smuggle_url(kaltura_url, {'source_url': url}), KalturaIE.ie_key())
container_id = self._search_regex(
r'<div class="videoplayerjw"[^>]+data-container="([0-9]+)"',
webpage, 'container ID')
sequenz_id = self._search_regex(
r'<div class="videoplayerjw"[^>]+data-sequenz="([0-9]+)"',
webpage, 'sequenz ID')

View File

@@ -0,0 +1,96 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
urlencode_postdata,
)
class HiDiveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?hidive\.com/stream/(?P<title>[^/]+)/(?P<key>[^/?#&]+)'
# Using X-Forwarded-For results in 403 HTTP error for HLS fragments,
# so disabling geo bypass completely
_GEO_BYPASS = False
_TESTS = [{
'url': 'https://www.hidive.com/stream/the-comic-artist-and-his-assistants/s01e001',
'info_dict': {
'id': 'the-comic-artist-and-his-assistants/s01e001',
'ext': 'mp4',
'title': 'the-comic-artist-and-his-assistants/s01e001',
'series': 'the-comic-artist-and-his-assistants',
'season_number': 1,
'episode_number': 1,
},
'params': {
'skip_download': True,
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
title, key = mobj.group('title', 'key')
video_id = '%s/%s' % (title, key)
settings = self._download_json(
'https://www.hidive.com/play/settings', video_id,
data=urlencode_postdata({
'Title': title,
'Key': key,
}))
restriction = settings.get('restrictionReason')
if restriction == 'RegionRestricted':
self.raise_geo_restricted()
if restriction and restriction != 'None':
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, restriction), expected=True)
formats = []
subtitles = {}
for rendition_id, rendition in settings['renditions'].items():
bitrates = rendition.get('bitrates')
if not isinstance(bitrates, dict):
continue
m3u8_url = bitrates.get('hls')
if not isinstance(m3u8_url, compat_str):
continue
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='%s-hls' % rendition_id, fatal=False))
cc_files = rendition.get('ccFiles')
if not isinstance(cc_files, list):
continue
for cc_file in cc_files:
if not isinstance(cc_file, list) or len(cc_file) < 3:
continue
cc_lang = cc_file[0]
cc_url = cc_file[2]
if not isinstance(cc_lang, compat_str) or not isinstance(
cc_url, compat_str):
continue
subtitles.setdefault(cc_lang, []).append({
'url': cc_url,
})
season_number = int_or_none(self._search_regex(
r's(\d+)', key, 'season number', default=None))
episode_number = int_or_none(self._search_regex(
r'e(\d+)', key, 'episode number', default=None))
return {
'id': video_id,
'title': video_id,
'subtitles': subtitles,
'formats': formats,
'series': title,
'season_number': season_number,
'episode_number': episode_number,
}
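
Season and episode numbers in the new extractor are derived from the URL key; for the key in the test case above, the two regexes resolve as follows (a standalone sketch, not part of the diff):

import re

key = 's01e001'  # from the test URL above

season_number = int(re.search(r's(\d+)', key).group(1))   # -> 1
episode_number = int(re.search(r'e(\d+)', key).group(1))  # -> 1
print(season_number, episode_number)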

View File

@@ -1,22 +1,27 @@
# coding: utf-8
from __future__ import unicode_literals
import hashlib
import random
import re
import time
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
parse_duration,
try_get,
urlencode_postdata,
)
class NexxIE(InfoExtractor):
_VALID_URL = r'''(?x)
(?:
https?://api\.nexx(?:\.cloud|cdn\.com)/v3/\d+/videos/byid/|
nexx:(?:\d+:)?|
https?://api\.nexx(?:\.cloud|cdn\.com)/v3/(?P<domain_id>\d+)/videos/byid/|
nexx:(?:(?P<domain_id_s>\d+):)?|
https?://arc\.nexx\.cloud/api/video/
)
(?P<id>\d+)
@@ -57,6 +62,21 @@ class NexxIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
# does not work via arc
'url': 'nexx:741:1269984',
'md5': 'c714b5b238b2958dc8d5642addba6886',
'info_dict': {
'id': '1269984',
'ext': 'mp4',
'title': '1 TAG ohne KLO... wortwörtlich! 😑',
'alt_title': '1 TAG ohne KLO... wortwörtlich! 😑',
'description': 'md5:4604539793c49eda9443ab5c5b1d612f',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 607,
'timestamp': 1518614955,
'upload_date': '20180214',
},
}, {
'url': 'https://api.nexxcdn.com/v3/748/videos/byid/128907',
'only_matching': True,
@@ -103,12 +123,99 @@ class NexxIE(InfoExtractor):
def _extract_url(webpage):
return NexxIE._extract_urls(webpage)[0]
def _real_extract(self, url):
video_id = self._match_id(url)
def _handle_error(self, response):
status = int_or_none(try_get(
response, lambda x: x['metadata']['status']) or 200)
if 200 <= status < 300:
return
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, response['metadata']['errorhint']),
expected=True)
video = self._download_json(
def _call_api(self, domain_id, path, video_id, data=None, headers={}):
headers['Content-Type'] = 'application/x-www-form-urlencoded; charset=UTF-8'
result = self._download_json(
'https://api.nexx.cloud/v3/%s/%s' % (domain_id, path), video_id,
'Downloading %s JSON' % path, data=urlencode_postdata(data),
headers=headers)
self._handle_error(result)
return result['result']
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
domain_id = mobj.group('domain_id') or mobj.group('domain_id_s')
video_id = mobj.group('id')
video = None
response = self._download_json(
'https://arc.nexx.cloud/api/video/%s.json' % video_id,
video_id)['result']
video_id, fatal=False)
if response and isinstance(response, dict):
result = response.get('result')
if result and isinstance(result, dict):
video = result
# not all videos work via arc, e.g. nexx:741:1269984
if not video:
# Reverse engineered from JS code (see getDeviceID function)
device_id = '%d:%d:%d%d' % (
random.randint(1, 4), int(time.time()),
random.randint(1e4, 99999), random.randint(1, 9))
result = self._call_api(domain_id, 'session/init', video_id, data={
'nxp_devh': device_id,
'nxp_userh': '',
'precid': '0',
'playlicense': '0',
'screenx': '1920',
'screeny': '1080',
'playerversion': '6.0.00',
'gateway': 'html5',
'adGateway': '',
'explicitlanguage': 'en-US',
'addTextTemplates': '1',
'addDomainData': '1',
'addAdModel': '1',
}, headers={
'X-Request-Enable-Auth-Fallback': '1',
})
cid = result['general']['cid']
# As described in [1] X-Request-Token generation algorithm is
# as follows:
# md5( operation + domain_id + domain_secret )
# where domain_secret is a static value that will be given by nexx.tv
# as per [1]. Here is how this "secret" is generated (reversed
# from _play.api.init function, search for clienttoken). So it's
# actually not static and not that much of a secret.
# 1. https://nexxtvstorage.blob.core.windows.net/files/201610/27.pdf
secret = result['device']['clienttoken'][int(device_id[0]):]
secret = secret[0:len(secret) - int(device_id[-1])]
op = 'byid'
# Reversed from JS code for _play.api.call function (search for
# X-Request-Token)
request_token = hashlib.md5(
''.join((op, domain_id, secret)).encode('utf-8')).hexdigest()
video = self._call_api(
domain_id, 'videos/%s/%s' % (op, video_id), video_id, data={
'additionalfields': 'language,channel,actors,studio,licenseby,slug,subtitle,teaser,description',
'addInteractionOptions': '1',
'addStatusDetails': '1',
'addStreamDetails': '1',
'addCaptions': '1',
'addScenes': '1',
'addHotSpots': '1',
'addBumpers': '1',
'captionFormat': 'data',
}, headers={
'X-Request-CID': cid,
'X-Request-Token': request_token,
})
general = video['general']
title = general['title']
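
To illustrate the token derivation documented in the comments above, here is a standalone sketch with made-up values for device_id, clienttoken and domain_id; only the slicing and hashing steps mirror the diff:

import hashlib

# Hypothetical inputs, for illustration; the real clienttoken comes from session/init
domain_id = '741'
device_id = '3:1520600000:123451'           # built as '%d:%d:%d%d' in the diff
clienttoken = 'abcdefghijklmnopqrstuvwxyz'

# Strip int(device_id[0]) chars from the front and int(device_id[-1]) from the end
secret = clienttoken[int(device_id[0]):]
secret = secret[0:len(secret) - int(device_id[-1])]

op = 'byid'
request_token = hashlib.md5(
    ''.join((op, domain_id, secret)).encode('utf-8')).hexdigest()
print(request_token)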

View File

@@ -198,7 +198,7 @@ class NickNightIE(NickDeIE):
class NickRuIE(MTVServicesInfoExtractor):
IE_NAME = 'nickelodeonru'
_VALID_URL = r'https?://(?:www\.)nickelodeon\.(?:ru|fr|es|pt|ro|hu)/[^/]+/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)nickelodeon\.(?:ru|fr|es|pt|ro|hu|com\.tr)/[^/]+/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.nickelodeon.ru/shows/henrydanger/videos/episodes/3-sezon-15-seriya-licenziya-na-polyot/pmomfb#playlist/7airc6',
'only_matching': True,
@@ -220,6 +220,9 @@ class NickRuIE(MTVServicesInfoExtractor):
}, {
'url': 'http://www.nickelodeon.hu/musorok/spongyabob-kockanadrag/videok/episodes/buborekfujas-az-elszakadt-nadrag/q57iob#playlist/k6te4y',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.com.tr/programlar/sunger-bob/videolar/kayip-yatak/mgqbjy',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -195,6 +195,10 @@ class NPOIE(NPOBaseIE):
formats = []
urls = set()
def is_legal_url(format_url):
return format_url and format_url not in urls and re.match(
r'^(?:https?:)?//', format_url)
QUALITY_LABELS = ('Laag', 'Normaal', 'Hoog')
QUALITY_FORMATS = ('adaptive', 'wmv_sb', 'h264_sb', 'wmv_bb', 'h264_bb', 'wvc1_std', 'h264_std')
@@ -208,7 +212,7 @@ class NPOIE(NPOBaseIE):
})['items'][0]
for num, item in enumerate(items):
item_url = item.get('url')
if not item_url or item_url in urls:
if not is_legal_url(item_url):
continue
urls.add(item_url)
format_id = self._search_regex(
@@ -229,7 +233,7 @@ class NPOIE(NPOBaseIE):
quality = quality_from_format_id(format_id)
f_id = format_id
else:
quality, f_id = None
quality, f_id = [None] * 2
formats.append({
'url': format_url,
'format_id': f_id,
@@ -279,7 +283,7 @@ class NPOIE(NPOBaseIE):
if not is_live:
for num, stream in enumerate(metadata.get('streams', [])):
stream_url = stream.get('url')
if not stream_url or stream_url in urls:
if not is_legal_url(stream_url):
continue
urls.add(stream_url)
# smooth streaming is not supported

View File

@@ -114,13 +114,14 @@ class PornHubIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
self._set_cookie('pornhub.com', 'age_verified', '1')
def dl_webpage(platform):
self._set_cookie('pornhub.com', 'platform', platform)
return self._download_webpage(
'http://www.pornhub.com/view_video.php?viewkey=%s' % video_id,
video_id, headers={
'Cookie': 'age_verified=1; platform=%s' % platform,
})
video_id)
webpage = dl_webpage('pc')

View File

@@ -0,0 +1,103 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .vimeo import VimeoIE
from ..utils import (
extract_attributes,
ExtractorError,
orderedSet,
smuggle_url,
unsmuggle_url,
urljoin,
)
class RayWenderlichIE(InfoExtractor):
_VALID_URL = r'https?://videos\.raywenderlich\.com/courses/(?P<course_id>[^/]+)/lessons/(?P<id>\d+)'
_TESTS = [{
'url': 'https://videos.raywenderlich.com/courses/105-testing-in-ios/lessons/1',
'info_dict': {
'id': '248377018',
'ext': 'mp4',
'title': 'Testing In iOS Episode 1: Introduction',
'duration': 133,
'uploader': 'Ray Wenderlich',
'uploader_id': 'user3304672',
},
'params': {
'noplaylist': True,
'skip_download': True,
},
'add_ie': [VimeoIE.ie_key()],
'expected_warnings': ['HTTP Error 403: Forbidden'],
}, {
'url': 'https://videos.raywenderlich.com/courses/105-testing-in-ios/lessons/1',
'info_dict': {
'title': 'Testing in iOS',
'id': '105-testing-in-ios',
},
'params': {
'noplaylist': False,
},
'playlist_count': 29,
}]
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
mobj = re.match(self._VALID_URL, url)
course_id, lesson_id = mobj.group('course_id', 'id')
video_id = '%s/%s' % (course_id, lesson_id)
webpage = self._download_webpage(url, video_id)
no_playlist = self._downloader.params.get('noplaylist')
if no_playlist or smuggled_data.get('force_video', False):
if no_playlist:
self.to_screen(
'Downloading just video %s because of --no-playlist'
% video_id)
if '>Subscribe to unlock' in webpage:
raise ExtractorError(
'This content is only available for subscribers',
expected=True)
vimeo_id = self._search_regex(
r'data-vimeo-id=["\'](\d+)', webpage, 'video id')
return self.url_result(
VimeoIE._smuggle_referrer(
'https://player.vimeo.com/video/%s' % vimeo_id, url),
ie=VimeoIE.ie_key(), video_id=vimeo_id)
self.to_screen(
'Downloading playlist %s - add --no-playlist to just download video'
% course_id)
lesson_ids = set((lesson_id, ))
for lesson in re.findall(
r'(<a[^>]+\bclass=["\']lesson-link[^>]+>)', webpage):
attrs = extract_attributes(lesson)
if not attrs:
continue
lesson_url = attrs.get('href')
if not lesson_url:
continue
lesson_id = self._search_regex(
r'/lessons/(\d+)', lesson_url, 'lesson id', default=None)
if not lesson_id:
continue
lesson_ids.add(lesson_id)
entries = []
for lesson_id in sorted(lesson_ids):
entries.append(self.url_result(
smuggle_url(urljoin(url, lesson_id), {'force_video': True}),
ie=RayWenderlichIE.ie_key()))
title = self._search_regex(
r'class=["\']course-title[^>]+>([^<]+)', webpage, 'course title',
default=None)
return self.playlist_result(entries, course_id, title)

View File

@@ -53,6 +53,12 @@ class RuutuIE(InfoExtractor):
'age_limit': 0,
},
},
# Episode where <SourceFile> is "NOT-USED", but has other
# downloadable sources available.
{
'url': 'http://www.ruutu.fi/video/3193728',
'only_matching': True,
},
]
def _real_extract(self, url):
@@ -72,7 +78,7 @@ class RuutuIE(InfoExtractor):
video_url = child.text
if (not video_url or video_url in processed_urls or
any(p in video_url for p in ('NOT_USED', 'NOT-USED'))):
return
continue
processed_urls.append(video_url)
ext = determine_ext(video_url)
if ext == 'm3u8':

View File

@@ -3,7 +3,12 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import ExtractorError
from ..utils import (
ExtractorError,
parse_duration,
parse_resolution,
str_to_int,
)
class SpankBangIE(InfoExtractor):
@@ -15,7 +20,7 @@ class SpankBangIE(InfoExtractor):
'id': '3vvn',
'ext': 'mp4',
'title': 'fantasy solo',
'description': 'Watch fantasy solo free HD porn video - 05 minutes - Babe,Masturbation,Solo,Toy - dillion harper masturbates on a bed free adult movies sexy clips.',
'description': 'dillion harper masturbates on a bed',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'silly2587',
'age_limit': 18,
@@ -32,36 +37,49 @@ class SpankBangIE(InfoExtractor):
# mobile page
'url': 'http://m.spankbang.com/1o2de/video/can+t+remember+her+name',
'only_matching': True,
}, {
# 4k
'url': 'https://spankbang.com/1vwqx/video/jade+kush+solo+4k',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage = self._download_webpage(url, video_id, headers={
'Cookie': 'country=US'
})
if re.search(r'<[^>]+\bid=["\']video_removed', webpage):
raise ExtractorError(
'Video %s is not available' % video_id, expected=True)
stream_key = self._html_search_regex(
r'''var\s+stream_key\s*=\s*['"](.+?)['"]''',
webpage, 'stream key')
formats = [{
'url': 'http://spankbang.com/_%s/%s/title/%sp__mp4' % (video_id, stream_key, height),
'ext': 'mp4',
'format_id': '%sp' % height,
'height': int(height),
} for height in re.findall(r'<(?:span|li|p)[^>]+[qb]_(\d+)p', webpage)]
self._check_formats(formats, video_id)
formats = []
for mobj in re.finditer(
r'stream_url_(?P<id>[^\s=]+)\s*=\s*(["\'])(?P<url>(?:(?!\2).)+)\2',
webpage):
format_id, format_url = mobj.group('id', 'url')
f = parse_resolution(format_id)
f.update({
'url': format_url,
'format_id': format_id,
})
formats.append(f)
self._sort_formats(formats)
title = self._html_search_regex(
r'(?s)<h1[^>]*>(.+?)</h1>', webpage, 'title')
description = self._og_search_description(webpage)
description = self._search_regex(
r'<div[^>]+\bclass=["\']bottom[^>]+>\s*<p>[^<]*</p>\s*<p>([^<]+)',
webpage, 'description', fatal=False)
thumbnail = self._og_search_thumbnail(webpage)
uploader = self._search_regex(
r'class="user"[^>]*><img[^>]+>([^<]+)',
webpage, 'uploader', default=None)
duration = parse_duration(self._search_regex(
r'<div[^>]+\bclass=["\']right_side[^>]+>\s*<span>([^<]+)',
webpage, 'duration', fatal=False))
view_count = str_to_int(self._search_regex(
r'([\d,.]+)\s+plays', webpage, 'view count', fatal=False))
age_limit = self._rta_search(webpage)
@@ -71,6 +89,8 @@ class SpankBangIE(InfoExtractor):
'description': description,
'thumbnail': thumbnail,
'uploader': uploader,
'duration': duration,
'view_count': view_count,
'formats': formats,
'age_limit': age_limit,
}

View File

@@ -132,7 +132,7 @@ class ToggleIE(InfoExtractor):
formats = []
for video_file in info.get('Files', []):
video_url, vid_format = video_file.get('URL'), video_file.get('Format')
if not video_url or not vid_format:
if not video_url or video_url == 'NA' or not vid_format:
continue
ext = determine_ext(video_url)
vid_format = vid_format.replace(' ', '')
@@ -143,6 +143,18 @@ class ToggleIE(InfoExtractor):
note='Downloading %s m3u8 information' % vid_format,
errnote='Failed to download %s m3u8 information' % vid_format,
fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id=vid_format,
note='Downloading %s MPD manifest' % vid_format,
errnote='Failed to download %s MPD manifest' % vid_format,
fatal=False))
elif ext == 'ism':
formats.extend(self._extract_ism_formats(
video_url, video_id, ism_id=vid_format,
note='Downloading %s ISM manifest' % vid_format,
errnote='Failed to download %s ISM manifest' % vid_format,
fatal=False))
elif ext in ('mp4', 'wvm'):
# wvm are drm-protected files
formats.append({

View File

@@ -7,6 +7,7 @@ from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
parse_iso8601,
parse_duration,
update_url_query,
@@ -16,8 +17,9 @@ from ..utils import (
class TVNowBaseIE(InfoExtractor):
_VIDEO_FIELDS = (
'id', 'title', 'free', 'geoblocked', 'articleLong', 'articleShort',
'broadcastStartDate', 'isDrm', 'duration', 'manifest.dashclear',
'format.defaultImage169Format', 'format.defaultImage169Logo')
'broadcastStartDate', 'isDrm', 'duration', 'season', 'episode',
'manifest.dashclear', 'format.title', 'format.defaultImage169Format',
'format.defaultImage169Logo')
def _call_api(self, path, video_id, query):
return self._download_json(
@@ -66,6 +68,10 @@ class TVNowBaseIE(InfoExtractor):
'thumbnail': thumbnail,
'timestamp': timestamp,
'duration': duration,
'series': f.get('title'),
'season_number': int_or_none(info.get('season')),
'episode_number': int_or_none(info.get('episode')),
'episode': title,
'formats': formats,
}
@@ -74,18 +80,21 @@ class TVNowIE(TVNowBaseIE):
_VALID_URL = r'https?://(?:www\.)?tvnow\.(?:de|at|ch)/(?:rtl(?:2|plus)?|nitro|superrtl|ntv|vox)/(?P<show_id>[^/]+)/(?:(?:list/[^/]+|jahr/\d{4}/\d{1,2})/)?(?P<id>[^/]+)/(?:player|preview)'
_TESTS = [{
# rtl
'url': 'https://www.tvnow.de/rtl/alarm-fuer-cobra-11/freier-fall/player?return=/rtl',
'url': 'https://www.tvnow.de/rtl2/grip-das-motormagazin/der-neue-porsche-911-gt-3/player',
'info_dict': {
'id': '385314',
'display_id': 'alarm-fuer-cobra-11/freier-fall',
'id': '331082',
'display_id': 'grip-das-motormagazin/der-neue-porsche-911-gt-3',
'ext': 'mp4',
'title': 'Freier Fall',
'description': 'md5:8c2d8f727261adf7e0dc18366124ca02',
'title': 'Der neue Porsche 911 GT 3',
'description': 'md5:6143220c661f9b0aae73b245e5d898bb',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1512677700,
'upload_date': '20171207',
'duration': 2862.0,
'timestamp': 1495994400,
'upload_date': '20170528',
'duration': 5283,
'series': 'GRIP - Das Motormagazin',
'season_number': 14,
'episode_number': 405,
'episode': 'Der neue Porsche 911 GT 3',
},
}, {
# rtl2

View File

@@ -5,113 +5,52 @@ import re
import time
import hashlib
import json
import random
from .adobepass import AdobePassIE
from .youtube import YoutubeIE
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
ExtractorError,
int_or_none,
parse_age_limit,
str_or_none,
parse_duration,
ExtractorError,
extract_attributes,
try_get,
)
class ViceBaseIE(AdobePassIE):
def _extract_preplay_video(self, url, locale, webpage):
watch_hub_data = extract_attributes(self._search_regex(
r'(?s)(<watch-hub\s*.+?</watch-hub>)', webpage, 'watch hub'))
video_id = watch_hub_data['vms-id']
title = watch_hub_data['video-title']
query = {}
is_locked = watch_hub_data.get('video-locked') == '1'
if is_locked:
resource = self._get_mvpd_resource(
'VICELAND', title, video_id,
watch_hub_data.get('video-rating'))
query['tvetoken'] = self._extract_mvpd_auth(
url, video_id, 'VICELAND', resource)
# signature generation algorithm is reverse engineered from signatureGenerator in
# webpack:///../shared/~/vice-player/dist/js/vice-player.js in
# https://www.viceland.com/assets/common/js/web.vendor.bundle.js
exp = int(time.time()) + 14400
query.update({
'exp': exp,
'sign': hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest(),
})
try:
host = 'www.viceland' if is_locked else self._PREPLAY_HOST
preplay = self._download_json(
'https://%s.com/%s/preplay/%s' % (host, locale, video_id),
video_id, query=query)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
error = json.loads(e.cause.read().decode())
raise ExtractorError('%s said: %s' % (
self.IE_NAME, error['details']), expected=True)
raise
video_data = preplay['video']
base = video_data['base']
uplynk_preplay_url = preplay['preplayURL']
episode = video_data.get('episode', {})
channel = video_data.get('channel', {})
subtitles = {}
cc_url = preplay.get('ccURL')
if cc_url:
subtitles['en'] = [{
'url': cc_url,
}]
return {
'_type': 'url_transparent',
'url': uplynk_preplay_url,
'id': video_id,
'title': title,
'description': base.get('body') or base.get('display_body'),
'thumbnail': watch_hub_data.get('cover-image') or watch_hub_data.get('thumbnail'),
'duration': int_or_none(video_data.get('video_duration')) or parse_duration(watch_hub_data.get('video-duration')),
'timestamp': int_or_none(video_data.get('created_at'), 1000),
'age_limit': parse_age_limit(video_data.get('video_rating')),
'series': video_data.get('show_title') or watch_hub_data.get('show-title'),
'episode_number': int_or_none(episode.get('episode_number') or watch_hub_data.get('episode')),
'episode_id': str_or_none(episode.get('id') or video_data.get('episode_id')),
'season_number': int_or_none(watch_hub_data.get('season')),
'season_id': str_or_none(episode.get('season_id')),
'uploader': channel.get('base', {}).get('title') or watch_hub_data.get('channel-title'),
'uploader_id': str_or_none(channel.get('id')),
'subtitles': subtitles,
'ie_key': 'UplynkPreplay',
}
class ViceIE(ViceBaseIE):
class ViceIE(AdobePassIE):
IE_NAME = 'vice'
_VALID_URL = r'https?://(?:.+?\.)?vice\.com/(?:(?P<locale>[^/]+)/)?videos?/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:(?:video|vms)\.vice|(?:www\.)?viceland)\.com/(?P<locale>[^/]+)/(?:video/[^/]+|embed)/(?P<id>[\da-f]+)'
_TESTS = [{
'url': 'https://news.vice.com/video/experimenting-on-animals-inside-the-monkey-lab',
'md5': '7d3ae2f9ba5f196cdd9f9efd43657ac2',
'url': 'https://video.vice.com/en_us/video/pet-cremator/58c69e38a55424f1227dc3f7',
'info_dict': {
'id': 'N2bzkydjraWDGwnt8jAttCF6Y0PDv4Zj',
'ext': 'flv',
'title': 'Monkey Labs of Holland',
'description': 'md5:92b3c7dcbfe477f772dd4afa496c9149',
'id': '5e647f0125e145c9aef2069412c0cbde',
'ext': 'mp4',
'title': '10 Questions You Always Wanted To Ask: Pet Cremator',
'description': 'md5:fe856caacf61fe0e74fab15ce2b07ca5',
'uploader': 'vice',
'uploader_id': '57a204088cb727dec794c67b',
'timestamp': 1489664942,
'upload_date': '20170316',
'age_limit': 14,
},
'add_ie': ['Ooyala'],
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['UplynkPreplay'],
}, {
# geo restricted to US
'url': 'https://video.vice.com/en_us/video/the-signal-from-tolva/5816510690b70e6c5fd39a56',
'info_dict': {
'id': '5816510690b70e6c5fd39a56',
'id': '930c0ad1f47141cc955087eecaddb0e2',
'ext': 'mp4',
'uploader': 'Waypoint',
'uploader': 'waypoint',
'title': 'The Signal From Tölva',
'description': 'md5:3927e3c79f9e8094606a2b3c5b5e55d5',
'uploader_id': '57f7d621e05ca860fa9ccaf9',
@@ -139,27 +78,131 @@ class ViceIE(ViceBaseIE):
'params': {
# AES-encrypted m3u8
'skip_download': True,
'proxy': '127.0.0.1:8118',
},
'add_ie': ['UplynkPreplay'],
}, {
'url': 'https://video.vice.com/en_us/video/pizza-show-trailer/56d8c9a54d286ed92f7f30e4',
'only_matching': True,
}, {
'url': 'https://video.vice.com/en_us/embed/57f41d3556a0a80f54726060',
'only_matching': True,
}, {
'url': 'https://vms.vice.com/en_us/video/preplay/58c69e38a55424f1227dc3f7',
'only_matching': True,
}, {
'url': 'https://www.viceland.com/en_us/video/thursday-march-1-2018/5a8f2d7ff1cdb332dd446ec1',
'only_matching': True,
}]
_PREPLAY_HOST = 'video.vice'
_PREPLAY_HOST = 'vms.vice'
@staticmethod
def _extract_urls(webpage):
return re.findall(
r'<iframe\b[^>]+\bsrc=["\']((?:https?:)?//video\.vice\.com/[^/]+/embed/[\da-f]+)',
webpage)
@staticmethod
def _extract_url(webpage):
urls = ViceIE._extract_urls(webpage)
return urls[0] if urls else None
def _real_extract(self, url):
locale, video_id = re.match(self._VALID_URL, url).groups()
webpage, urlh = self._download_webpage_handle(url, video_id)
embed_code = self._search_regex(
r'embedCode=([^&\'"]+)', webpage,
'ooyala embed code', default=None)
if embed_code:
return self.url_result('ooyala:%s' % embed_code, 'Ooyala')
youtube_id = self._search_regex(
r'data-youtube-id="([^"]+)"', webpage, 'youtube id', default=None)
if youtube_id:
return self.url_result(youtube_id, 'Youtube')
return self._extract_preplay_video(urlh.geturl(), locale, webpage)
webpage = self._download_webpage(
'https://video.vice.com/%s/embed/%s' % (locale, video_id),
video_id)
video = self._parse_json(
self._search_regex(
r'PREFETCH_DATA\s*=\s*({.+?})\s*;\s*\n', webpage,
'app state'), video_id)['video']
video_id = video.get('vms_id') or video.get('id') or video_id
title = video['title']
is_locked = video.get('locked')
rating = video.get('rating')
thumbnail = video.get('thumbnail_url')
duration = int_or_none(video.get('duration'))
series = try_get(
video, lambda x: x['episode']['season']['show']['title'],
compat_str)
episode_number = try_get(
video, lambda x: x['episode']['episode_number'])
season_number = try_get(
video, lambda x: x['episode']['season']['season_number'])
uploader = None
query = {}
if is_locked:
resource = self._get_mvpd_resource(
'VICELAND', title, video_id, rating)
query['tvetoken'] = self._extract_mvpd_auth(
url, video_id, 'VICELAND', resource)
# signature generation algorithm is reverse engineered from signatureGenerator in
# webpack:///../shared/~/vice-player/dist/js/vice-player.js in
# https://www.viceland.com/assets/common/js/web.vendor.bundle.js
# new JS is located here https://vice-web-statics-cdn.vice.com/vice-player/player-embed.js
exp = int(time.time()) + 1440
query.update({
'exp': exp,
'sign': hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest(),
'_ad_blocked': None,
'_ad_unit': '',
'_debug': '',
'platform': 'desktop',
'rn': random.randint(10000, 100000),
'fbprebidtoken': '',
})
try:
host = 'www.viceland' if is_locked else self._PREPLAY_HOST
preplay = self._download_json(
'https://%s.com/%s/video/preplay/%s' % (host, locale, video_id),
video_id, query=query)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 401):
error = json.loads(e.cause.read().decode())
error_message = error.get('error_description') or error['details']
raise ExtractorError('%s said: %s' % (
self.IE_NAME, error_message), expected=True)
raise
video_data = preplay['video']
base = video_data['base']
uplynk_preplay_url = preplay['preplayURL']
episode = video_data.get('episode', {})
channel = video_data.get('channel', {})
subtitles = {}
cc_url = preplay.get('ccURL')
if cc_url:
subtitles['en'] = [{
'url': cc_url,
}]
return {
'_type': 'url_transparent',
'url': uplynk_preplay_url,
'id': video_id,
'title': title,
'description': base.get('body') or base.get('display_body'),
'thumbnail': thumbnail,
'duration': int_or_none(video_data.get('video_duration')) or duration,
'timestamp': int_or_none(video_data.get('created_at'), 1000),
'age_limit': parse_age_limit(video_data.get('video_rating')),
'series': video_data.get('show_title') or series,
'episode_number': int_or_none(episode.get('episode_number') or episode_number),
'episode_id': str_or_none(episode.get('id') or video_data.get('episode_id')),
'season_number': int_or_none(season_number),
'season_id': str_or_none(episode.get('season_id')),
'uploader': channel.get('base', {}).get('title') or channel.get('name') or uploader,
'uploader_id': str_or_none(channel.get('id')),
'subtitles': subtitles,
'ie_key': 'UplynkPreplay',
}
class ViceShowIE(InfoExtractor):
@@ -203,14 +246,15 @@ class ViceArticleIE(InfoExtractor):
_TESTS = [{
'url': 'https://www.vice.com/en_us/article/on-set-with-the-woman-making-mormon-porn-in-utah',
'info_dict': {
'id': '58dc0a3dee202d2a0ccfcbd8',
'id': '41eae2a47b174a1398357cec55f1f6fc',
'ext': 'mp4',
'title': 'Mormon War on Porn ',
'description': 'md5:ad396a2481e7f8afb5ed486878421090',
'uploader': 'VICE',
'uploader_id': '57a204088cb727dec794c693',
'timestamp': 1489160690,
'upload_date': '20170310',
'description': 'md5:6394a8398506581d0346b9ab89093fef',
'uploader': 'vice',
'uploader_id': '57a204088cb727dec794c67b',
'timestamp': 1491883129,
'upload_date': '20170411',
'age_limit': 17,
},
'params': {
# AES-encrypted m3u8
@@ -219,17 +263,35 @@ class ViceArticleIE(InfoExtractor):
'add_ie': ['UplynkPreplay'],
}, {
'url': 'https://www.vice.com/en_us/article/how-to-hack-a-car',
'md5': 'a7ecf64ee4fa19b916c16f4b56184ae2',
'md5': '7fe8ebc4fa3323efafc127b82bd821d9',
'info_dict': {
'id': '3jstaBeXgAs',
'ext': 'mp4',
'title': 'How to Hack a Car: Phreaked Out (Episode 2)',
'description': 'md5:ee95453f7ff495db8efe14ae8bf56f30',
'uploader_id': 'MotherboardTV',
'uploader': 'Motherboard',
'uploader_id': 'MotherboardTV',
'upload_date': '20140529',
},
'add_ie': ['Youtube'],
}, {
'url': 'https://www.vice.com/en_us/article/znm9dx/karley-sciortino-slutever-reloaded',
'md5': 'a7ecf64ee4fa19b916c16f4b56184ae2',
'info_dict': {
'id': 'e2ed435eb67e43efb66e6ef9a6930a88',
'ext': 'mp4',
'title': "Making The World's First Male Sex Doll",
'description': 'md5:916078ef0e032d76343116208b6cc2c4',
'uploader': 'vice',
'uploader_id': '57a204088cb727dec794c67b',
'timestamp': 1476919911,
'upload_date': '20161019',
'age_limit': 17,
},
'params': {
'skip_download': True,
},
'add_ie': [ViceIE.ie_key()],
}, {
'url': 'https://www.vice.com/en_us/article/cowboy-capitalists-part-1',
'only_matching': True,
@@ -244,8 +306,8 @@ class ViceArticleIE(InfoExtractor):
webpage = self._download_webpage(url, display_id)
prefetch_data = self._parse_json(self._search_regex(
r'window\.__PREFETCH_DATA\s*=\s*({.*});',
webpage, 'prefetch data'), display_id)
r'__APP_STATE\s*=\s*({.+?})(?:\s*\|\|\s*{}\s*)?;\s*\n',
webpage, 'app state'), display_id)['pageData']
body = prefetch_data['body']
def _url_res(video_url, ie_key):
@@ -256,6 +318,10 @@ class ViceArticleIE(InfoExtractor):
'ie_key': ie_key,
}
vice_url = ViceIE._extract_url(webpage)
if vice_url:
return _url_res(vice_url, ViceIE.ie_key())
embed_code = self._search_regex(
r'embedCode=([^&\'"]+)', body,
'ooyala embed code', default=None)
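
For reference, the preplay request signature in the reworked ViceIE above reduces to a SHA-512 digest over 'video_id:GET:exp'; a minimal standalone sketch using the video id from the pet-cremator test case:

import hashlib
import time

video_id = '58c69e38a55424f1227dc3f7'  # id from the test URL above
exp = int(time.time()) + 1440          # same lifetime as in the diff

sign = hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest()
query = {'exp': exp, 'sign': sign}
print(query)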

View File

@@ -1,38 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .vice import ViceBaseIE
class VicelandIE(ViceBaseIE):
_VALID_URL = r'https?://(?:www\.)?viceland\.com/(?P<locale>[^/]+)/video/[^/]+/(?P<id>[a-f0-9]+)'
_TEST = {
'url': 'https://www.viceland.com/en_us/video/trapped/588a70d0dba8a16007de7316',
'info_dict': {
'id': '588a70d0dba8a16007de7316',
'ext': 'mp4',
'title': 'TRAPPED (Series Trailer)',
'description': 'md5:7a8e95c2b6cd86461502a2845e581ccf',
'age_limit': 14,
'timestamp': 1485474122,
'upload_date': '20170126',
'uploader_id': '57a204098cb727dec794c6a3',
'uploader': 'Viceland',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['UplynkPreplay'],
'skip': '404',
}
_PREPLAY_HOST = 'www.viceland'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
locale = mobj.group('locale')
webpage = self._download_webpage(url, video_id)
return self._extract_preplay_video(url, locale, webpage)

View File

@@ -13,7 +13,7 @@ from ..utils import (
class VidziIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vidzi\.(?:tv|cc)/(?:embed-)?(?P<id>[0-9a-zA-Z]+)'
_VALID_URL = r'https?://(?:www\.)?vidzi\.(?:tv|cc|si)/(?:embed-)?(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'http://vidzi.tv/cghql9yq6emu.html',
'md5': '4f16c71ca0c8c8635ab6932b5f3f1660',
@@ -32,6 +32,9 @@ class VidziIE(InfoExtractor):
}, {
'url': 'http://vidzi.cc/cghql9yq6emu.html',
'only_matching': True,
}, {
'url': 'https://vidzi.si/rph9gztxj1et.html',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -41,21 +41,30 @@ class VimeoBaseInfoExtractor(InfoExtractor):
if self._LOGIN_REQUIRED:
raise ExtractorError('No login info available, needed for using %s.' % self.IE_NAME, expected=True)
return
self.report_login()
webpage = self._download_webpage(self._LOGIN_URL, None, False)
webpage = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
token, vuid = self._extract_xsrft_and_vuid(webpage)
data = urlencode_postdata({
data = {
'action': 'login',
'email': username,
'password': password,
'service': 'vimeo',
'token': token,
})
login_request = sanitized_Request(self._LOGIN_URL, data)
login_request.add_header('Content-Type', 'application/x-www-form-urlencoded')
login_request.add_header('Referer', self._LOGIN_URL)
}
self._set_vimeo_cookie('vuid', vuid)
self._download_webpage(login_request, None, False, 'Wrong login info')
try:
self._download_webpage(
self._LOGIN_URL, None, 'Logging in',
data=urlencode_postdata(data), headers={
'Content-Type': 'application/x-www-form-urlencoded',
'Referer': self._LOGIN_URL,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 418:
raise ExtractorError(
'Unable to log in: bad username or password',
expected=True)
raise ExtractorError('Unable to log in')
def _verify_video_password(self, url, video_id, webpage):
password = self._downloader.params.get('videopassword')

View File

@@ -0,0 +1,101 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
qualities,
unescapeHTML,
)
class YapFilesIE(InfoExtractor):
_YAPFILES_URL = r'//(?:(?:www|api)\.)?yapfiles\.ru/get_player/*\?.*?\bv=(?P<id>\w+)'
_VALID_URL = r'https?:%s' % _YAPFILES_URL
_TESTS = [{
# with hd
'url': 'http://www.yapfiles.ru/get_player/?v=vMDE1NjcyNDUt0413',
'md5': '2db19e2bfa2450568868548a1aa1956c',
'info_dict': {
'id': 'vMDE1NjcyNDUt0413',
'ext': 'mp4',
'title': 'Самый худший пароль WIFI',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 72,
},
}, {
# without hd
'url': 'https://api.yapfiles.ru/get_player/?uid=video_player_1872528&plroll=1&adv=1&v=vMDE4NzI1Mjgt690b',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return [unescapeHTML(mobj.group('url')) for mobj in re.finditer(
r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?%s.*?)\1'
% YapFilesIE._YAPFILES_URL, webpage)]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id, fatal=False)
player_url = None
query = {}
if webpage:
player_url = self._search_regex(
r'player\.init\s*\(\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
'player url', default=None, group='url')
if not player_url:
player_url = 'http://api.yapfiles.ru/load/%s/' % video_id
query = {
'md5': 'ded5f369be61b8ae5f88e2eeb2f3caff',
'type': 'json',
'ref': url,
}
player = self._download_json(
player_url, video_id, query=query)['player']
playlist_url = player['playlist']
title = player['title']
thumbnail = player.get('poster')
if title == 'Ролик удален' or 'deleted.jpg' in (thumbnail or ''):
raise ExtractorError(
'Video %s has been removed' % video_id, expected=True)
playlist = self._download_json(
playlist_url, video_id)['player']['main']
hd_height = int_or_none(player.get('hd'))
QUALITIES = ('sd', 'hd')
quality_key = qualities(QUALITIES)
formats = []
for format_id in QUALITIES:
is_hd = format_id == 'hd'
format_url = playlist.get(
'file%s' % ('_hd' if is_hd else ''))
if not format_url or not isinstance(format_url, compat_str):
continue
formats.append({
'url': format_url,
'format_id': format_id,
'quality': quality_key(format_id),
'height': hd_height if is_hd else None,
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': int_or_none(player.get('length')),
'formats': formats,
}

View File

@@ -534,7 +534,7 @@ def parseOpts(overrideArguments=None):
workarounds.add_option(
'--prefer-insecure',
'--prefer-unsecure', action='store_true', dest='prefer_insecure',
help='Use an unencrypted connection to retrieve information whenever possible')
help='Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)')
workarounds.add_option(
'--user-agent',
metavar='UA', dest='user_agent',

View File

@@ -28,10 +28,10 @@ def rsa_verify(message, signature, key):
return expected == signature
def update_self(to_screen, verbose, opener, prefer_insecure=False):
def update_self(to_screen, verbose, opener):
"""Update the program file with the latest version from the repository"""
UPDATE_URL = '//rg3.github.io/youtube-dl/update/'
UPDATE_URL = 'https://rg3.github.io/youtube-dl/update/'
VERSION_URL = UPDATE_URL + 'LATEST_VERSION'
JSON_URL = UPDATE_URL + 'versions.json'
UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
@@ -40,13 +40,9 @@ def update_self(to_screen, verbose, opener, prefer_insecure=False):
to_screen('It looks like you installed youtube-dl with a package manager, pip, setup.py or a tarball. Please use that to update.')
return
def guess_scheme(url, insecure=False):
return 'http%s:%s' % ('' if insecure is True else 's', url)
# Check if there is a new version
try:
newversion = opener.open(guess_scheme(
VERSION_URL, prefer_insecure)).read().decode('utf-8').strip()
newversion = opener.open(VERSION_URL).read().decode('utf-8').strip()
except Exception:
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
@@ -58,8 +54,7 @@ def update_self(to_screen, verbose, opener, prefer_insecure=False):
# Download and check versions info
try:
versions_info = opener.open(guess_scheme(
JSON_URL, prefer_insecure)).read().decode('utf-8')
versions_info = opener.open(JSON_URL).read().decode('utf-8')
versions_info = json.loads(versions_info)
except Exception:
if verbose:

View File

@@ -1689,6 +1689,28 @@ def parse_count(s):
return lookup_unit_table(_UNIT_TABLE, s)
def parse_resolution(s):
if s is None:
return {}
mobj = re.search(r'\b(?P<w>\d+)\s*[xX×]\s*(?P<h>\d+)\b', s)
if mobj:
return {
'width': int(mobj.group('w')),
'height': int(mobj.group('h')),
}
mobj = re.search(r'\b(\d+)[pPiI]\b', s)
if mobj:
return {'height': int(mobj.group(1))}
mobj = re.search(r'\b([48])[kK]\b', s)
if mobj:
return {'height': int(mobj.group(1)) * 540}
return {}
def month_by_name(name, lang='en'):
""" Return the number of a month by (locale-independently) English name """

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2018.02.26'
__version__ = '2018.03.10'