Compare commits
14 Commits
2018.12.31
...
2019.01.02
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
d7c3af7a72 | ||
|
|
aeb72b3a41 | ||
|
|
2122d7151d | ||
|
|
751e051557 | ||
|
|
d226c560a6 | ||
|
|
8437f5089f | ||
|
|
1d803085d7 | ||
|
|
696f4e4114 | ||
|
|
0e713dbb11 | ||
|
|
9b5c8751ee | ||
|
|
d9f1123c08 | ||
|
|
3d8eb6beb9 | ||
|
|
38d15ba7f9 | ||
|
|
6b688b8942 |
6
.github/ISSUE_TEMPLATE.md
vendored
6
.github/ISSUE_TEMPLATE.md
vendored
@@ -6,8 +6,8 @@
|
||||
|
||||
---
|
||||
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.12.31*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.12.31**
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2019.01.02*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2019.01.02**
|
||||
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
@@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2018.12.31
|
||||
[debug] youtube-dl version 2019.01.02
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
||||
@@ -261,11 +261,33 @@ title = meta.get('title') or self._og_search_title(webpage)
|
||||
|
||||
This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
|
||||
|
||||
### Make regular expressions flexible
|
||||
### Regular expressions
|
||||
|
||||
When using regular expressions try to write them fuzzy and flexible.
|
||||
#### Don't capture groups you don't use
|
||||
|
||||
Capturing group must be an indication that it's used somewhere in the code. Any group that is not used must be non capturing.
|
||||
|
||||
##### Example
|
||||
|
||||
Don't capture id attribute name here since you can't use it for anything anyway.
|
||||
|
||||
Correct:
|
||||
|
||||
```python
|
||||
r'(?:id|ID)=(?P<id>\d+)'
|
||||
```
|
||||
|
||||
Incorrect:
|
||||
```python
|
||||
r'(id|ID)=(?P<id>\d+)'
|
||||
```
|
||||
|
||||
|
||||
#### Make regular expressions relaxed and flexible
|
||||
|
||||
When using regular expressions try to write them fuzzy, relaxed and flexible, skipping insignificant parts that are more likely to change, allowing both single and double quotes for quoted values and so on.
|
||||
|
||||
#### Example
|
||||
##### Example
|
||||
|
||||
Say you need to extract `title` from the following HTML code:
|
||||
|
||||
@@ -298,6 +320,25 @@ title = self._search_regex(
|
||||
webpage, 'title', group='title')
|
||||
```
|
||||
|
||||
### Long lines policy
|
||||
|
||||
There is a soft limit to keep lines of code under 80 characters long. This means it should be respected if possible and if it does not make readability and code maintenance worse.
|
||||
|
||||
For example, you should **never** split long string literals like URLs or some other often copied entities over multiple lines to fit this limit:
|
||||
|
||||
Correct:
|
||||
|
||||
```python
|
||||
'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
|
||||
```
|
||||
|
||||
Incorrect:
|
||||
|
||||
```python
|
||||
'https://www.youtube.com/watch?v=FqZTN594JQw&list='
|
||||
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
|
||||
```
|
||||
|
||||
### Use safe conversion functions
|
||||
|
||||
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
|
||||
|
||||
15
ChangeLog
15
ChangeLog
@@ -1,3 +1,18 @@
|
||||
version 2019.01.02
|
||||
|
||||
Extractors
|
||||
* [discovery] Use geo verification headers (#17838)
|
||||
+ [packtpub] Add support for subscription.packtpub.com (#18718)
|
||||
* [yourporn] Fix extraction (#18583)
|
||||
+ [acast:channel] Add support for play.acast.com (#18587)
|
||||
+ [extractors] Add missing age limits (#18621)
|
||||
+ [rmcdecouverte] Add support for live stream
|
||||
* [rmcdecouverte] Bypass geo restriction
|
||||
* [rmcdecouverte] Update URL regular expression (#18595, 18697)
|
||||
* [manyvids] Fix extraction (#18604, #18614)
|
||||
* [bitchute] Fix extraction (#18567)
|
||||
|
||||
|
||||
version 2018.12.31
|
||||
|
||||
Extractors
|
||||
|
||||
47
README.md
47
README.md
@@ -1133,11 +1133,33 @@ title = meta.get('title') or self._og_search_title(webpage)
|
||||
|
||||
This code will try to extract from `meta` first and if it fails it will try extracting `og:title` from a `webpage`.
|
||||
|
||||
### Make regular expressions flexible
|
||||
### Regular expressions
|
||||
|
||||
When using regular expressions try to write them fuzzy and flexible.
|
||||
#### Don't capture groups you don't use
|
||||
|
||||
Capturing group must be an indication that it's used somewhere in the code. Any group that is not used must be non capturing.
|
||||
|
||||
##### Example
|
||||
|
||||
Don't capture id attribute name here since you can't use it for anything anyway.
|
||||
|
||||
Correct:
|
||||
|
||||
```python
|
||||
r'(?:id|ID)=(?P<id>\d+)'
|
||||
```
|
||||
|
||||
Incorrect:
|
||||
```python
|
||||
r'(id|ID)=(?P<id>\d+)'
|
||||
```
|
||||
|
||||
|
||||
#### Make regular expressions relaxed and flexible
|
||||
|
||||
When using regular expressions try to write them fuzzy, relaxed and flexible, skipping insignificant parts that are more likely to change, allowing both single and double quotes for quoted values and so on.
|
||||
|
||||
#### Example
|
||||
##### Example
|
||||
|
||||
Say you need to extract `title` from the following HTML code:
|
||||
|
||||
@@ -1170,6 +1192,25 @@ title = self._search_regex(
|
||||
webpage, 'title', group='title')
|
||||
```
|
||||
|
||||
### Long lines policy
|
||||
|
||||
There is a soft limit to keep lines of code under 80 characters long. This means it should be respected if possible and if it does not make readability and code maintenance worse.
|
||||
|
||||
For example, you should **never** split long string literals like URLs or some other often copied entities over multiple lines to fit this limit:
|
||||
|
||||
Correct:
|
||||
|
||||
```python
|
||||
'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
|
||||
```
|
||||
|
||||
Incorrect:
|
||||
|
||||
```python
|
||||
'https://www.youtube.com/watch?v=FqZTN594JQw&list='
|
||||
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
|
||||
```
|
||||
|
||||
### Use safe conversion functions
|
||||
|
||||
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
|
||||
|
||||
@@ -79,17 +79,27 @@ class ACastIE(InfoExtractor):
|
||||
|
||||
class ACastChannelIE(InfoExtractor):
|
||||
IE_NAME = 'acast:channel'
|
||||
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<id>[^/#?]+)'
|
||||
_TEST = {
|
||||
'url': 'https://www.acast.com/condenasttraveler',
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
(?:www\.)?acast\.com/|
|
||||
play\.acast\.com/s/
|
||||
)
|
||||
(?P<id>[^/#?]+)
|
||||
'''
|
||||
_TESTS = [{
|
||||
'url': 'https://www.acast.com/todayinfocus',
|
||||
'info_dict': {
|
||||
'id': '50544219-29bb-499e-a083-6087f4cb7797',
|
||||
'title': 'Condé Nast Traveler Podcast',
|
||||
'description': 'md5:98646dee22a5b386626ae31866638fbd',
|
||||
'id': '4efc5294-5385-4847-98bd-519799ce5786',
|
||||
'title': 'Today in Focus',
|
||||
'description': 'md5:9ba5564de5ce897faeb12963f4537a64',
|
||||
},
|
||||
'playlist_mincount': 20,
|
||||
}
|
||||
_API_BASE_URL = 'https://www.acast.com/api/'
|
||||
'playlist_mincount': 35,
|
||||
}, {
|
||||
'url': 'http://play.acast.com/s/ft-banking-weekly',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_API_BASE_URL = 'https://play.acast.com/api/'
|
||||
_PAGE_SIZE = 10
|
||||
|
||||
@classmethod
|
||||
@@ -102,7 +112,7 @@ class ACastChannelIE(InfoExtractor):
|
||||
channel_slug, note='Download page %d of channel data' % page)
|
||||
for cast in casts:
|
||||
yield self.url_result(
|
||||
'https://www.acast.com/%s/%s' % (channel_slug, cast['url']),
|
||||
'https://play.acast.com/s/%s/%s' % (channel_slug, cast['url']),
|
||||
'ACast', cast['id'])
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
@@ -62,7 +62,7 @@ class AudiomackIE(InfoExtractor):
|
||||
# Audiomack wraps a lot of soundcloud tracks in their branded wrapper
|
||||
# if so, pass the work off to the soundcloud extractor
|
||||
if SoundcloudIE.suitable(api_response['url']):
|
||||
return {'_type': 'url', 'url': api_response['url'], 'ie_key': 'Soundcloud'}
|
||||
return self.url_result(api_response['url'], SoundcloudIE.ie_key())
|
||||
|
||||
return {
|
||||
'id': compat_str(api_response.get('id', album_url_tag)),
|
||||
|
||||
@@ -5,7 +5,10 @@ import itertools
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import urlencode_postdata
|
||||
from ..utils import (
|
||||
orderedSet,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class BitChuteIE(InfoExtractor):
|
||||
@@ -43,10 +46,15 @@ class BitChuteIE(InfoExtractor):
|
||||
'description', webpage, 'title',
|
||||
default=None) or self._og_search_description(webpage)
|
||||
|
||||
format_urls = []
|
||||
for mobj in re.finditer(
|
||||
r'addWebSeed\s*\(\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage):
|
||||
format_urls.append(mobj.group('url'))
|
||||
format_urls.extend(re.findall(r'as=(https?://[^&"\']+)', webpage))
|
||||
|
||||
formats = [
|
||||
{'url': mobj.group('url')}
|
||||
for mobj in re.finditer(
|
||||
r'addWebSeed\s*\(\s*(["\'])(?P<url>(?:(?!\1).)+)\1', webpage)]
|
||||
{'url': format_url}
|
||||
for format_url in orderedSet(format_urls)]
|
||||
self._sort_formats(formats)
|
||||
|
||||
description = self._html_search_regex(
|
||||
|
||||
@@ -14,6 +14,7 @@ class CamModelsIE(InfoExtractor):
|
||||
_TESTS = [{
|
||||
'url': 'https://www.cammodels.com/cam/AutumnKnight/',
|
||||
'only_matching': True,
|
||||
'age_limit': 18
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@@ -93,4 +94,5 @@ class CamModelsIE(InfoExtractor):
|
||||
'title': self._live_title(user_id),
|
||||
'is_live': True,
|
||||
'formats': formats,
|
||||
'age_limit': 18
|
||||
}
|
||||
|
||||
@@ -20,6 +20,7 @@ class CamTubeIE(InfoExtractor):
|
||||
'duration': 1274,
|
||||
'timestamp': 1528018608,
|
||||
'upload_date': '20180603',
|
||||
'age_limit': 18
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@@ -66,4 +67,5 @@ class CamTubeIE(InfoExtractor):
|
||||
'like_count': like_count,
|
||||
'creator': creator,
|
||||
'formats': formats,
|
||||
'age_limit': 18
|
||||
}
|
||||
|
||||
@@ -25,6 +25,7 @@ class CamWithHerIE(InfoExtractor):
|
||||
'comment_count': int,
|
||||
'uploader': 'MileenaK',
|
||||
'upload_date': '20160322',
|
||||
'age_limit': 18,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@@ -84,4 +85,5 @@ class CamWithHerIE(InfoExtractor):
|
||||
'comment_count': comment_count,
|
||||
'uploader': uploader,
|
||||
'upload_date': upload_date,
|
||||
'age_limit': 18
|
||||
}
|
||||
|
||||
@@ -119,11 +119,7 @@ class CNNBlogsIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
webpage = self._download_webpage(url, url_basename(url))
|
||||
cnn_url = self._html_search_regex(r'data-url="(.+?)"', webpage, 'cnn url')
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': cnn_url,
|
||||
'ie_key': CNNIE.ie_key(),
|
||||
}
|
||||
return self.url_result(cnn_url, CNNIE.ie_key())
|
||||
|
||||
|
||||
class CNNArticleIE(InfoExtractor):
|
||||
@@ -145,8 +141,4 @@ class CNNArticleIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
webpage = self._download_webpage(url, url_basename(url))
|
||||
cnn_url = self._html_search_regex(r"video:\s*'([^']+)'", webpage, 'cnn url')
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': 'http://cnn.com/video/?/video/' + cnn_url,
|
||||
'ie_key': CNNIE.ie_key(),
|
||||
}
|
||||
return self.url_result('http://cnn.com/video/?/video/' + cnn_url, CNNIE.ie_key())
|
||||
|
||||
@@ -94,11 +94,12 @@ class DiscoveryIE(DiscoveryGoBaseIE):
|
||||
})['access_token']
|
||||
|
||||
try:
|
||||
headers = self.geo_verification_headers()
|
||||
headers['Authorization'] = 'Bearer ' + access_token
|
||||
|
||||
stream = self._download_json(
|
||||
'https://api.discovery.com/v1/streaming/video/' + video_id,
|
||||
display_id, headers={
|
||||
'Authorization': 'Bearer ' + access_token,
|
||||
})
|
||||
display_id, headers=headers)
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 403):
|
||||
e_description = self._parse_json(
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .youtube import YoutubeIE
|
||||
|
||||
|
||||
class FreespeechIE(InfoExtractor):
|
||||
@@ -27,8 +28,4 @@ class FreespeechIE(InfoExtractor):
|
||||
r'data-video-url="([^"]+)"',
|
||||
webpage, 'youtube url')
|
||||
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': youtube_url,
|
||||
'ie_key': 'Youtube',
|
||||
}
|
||||
return self.url_result(youtube_url, YoutubeIE.ie_key())
|
||||
|
||||
@@ -2197,10 +2197,7 @@ class GenericIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
if url.startswith('//'):
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': self.http_scheme() + url,
|
||||
}
|
||||
return self.url_result(self.http_scheme() + url)
|
||||
|
||||
parsed_url = compat_urlparse.urlparse(url)
|
||||
if not parsed_url.scheme:
|
||||
|
||||
@@ -363,7 +363,4 @@ class LivestreamShortenerIE(InfoExtractor):
|
||||
id = mobj.group('id')
|
||||
webpage = self._download_webpage(url, id)
|
||||
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': self._og_search_url(webpage),
|
||||
}
|
||||
return self.url_result(self._og_search_url(webpage))
|
||||
|
||||
@@ -2,12 +2,18 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import int_or_none
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
int_or_none,
|
||||
str_to_int,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class ManyVidsIE(InfoExtractor):
|
||||
_VALID_URL = r'(?i)https?://(?:www\.)?manyvids\.com/video/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
# preview video
|
||||
'url': 'https://www.manyvids.com/Video/133957/everthing-about-me/',
|
||||
'md5': '03f11bb21c52dd12a05be21a5c7dcc97',
|
||||
'info_dict': {
|
||||
@@ -17,7 +23,18 @@ class ManyVidsIE(InfoExtractor):
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
},
|
||||
}
|
||||
}, {
|
||||
# full video
|
||||
'url': 'https://www.manyvids.com/Video/935718/MY-FACE-REVEAL/',
|
||||
'md5': 'f3e8f7086409e9b470e2643edb96bdcc',
|
||||
'info_dict': {
|
||||
'id': '935718',
|
||||
'ext': 'mp4',
|
||||
'title': 'MY FACE REVEAL',
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
@@ -28,12 +45,41 @@ class ManyVidsIE(InfoExtractor):
|
||||
r'data-(?:video-filepath|meta-video)\s*=s*(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
webpage, 'video URL', group='url')
|
||||
|
||||
title = '%s (Preview)' % self._html_search_regex(
|
||||
r'<h2[^>]+class="m-a-0"[^>]*>([^<]+)', webpage, 'title')
|
||||
title = self._html_search_regex(
|
||||
(r'<span[^>]+class=["\']item-title[^>]+>([^<]+)',
|
||||
r'<h2[^>]+class=["\']h2 m-0["\'][^>]*>([^<]+)'),
|
||||
webpage, 'title', default=None) or self._html_search_meta(
|
||||
'twitter:title', webpage, 'title', fatal=True)
|
||||
|
||||
if any(p in webpage for p in ('preview_videos', '_preview.mp4')):
|
||||
title += ' (Preview)'
|
||||
|
||||
mv_token = self._search_regex(
|
||||
r'data-mvtoken=(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
|
||||
'mv token', default=None, group='value')
|
||||
|
||||
if mv_token:
|
||||
# Sets some cookies
|
||||
self._download_webpage(
|
||||
'https://www.manyvids.com/includes/ajax_repository/you_had_me_at_hello.php',
|
||||
video_id, fatal=False, data=urlencode_postdata({
|
||||
'mvtoken': mv_token,
|
||||
'vid': video_id,
|
||||
}), headers={
|
||||
'Referer': url,
|
||||
'X-Requested-With': 'XMLHttpRequest'
|
||||
})
|
||||
|
||||
if determine_ext(video_url) == 'm3u8':
|
||||
formats = self._extract_m3u8_formats(
|
||||
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls')
|
||||
else:
|
||||
formats = [{'url': video_url}]
|
||||
|
||||
like_count = int_or_none(self._search_regex(
|
||||
r'data-likes=["\'](\d+)', webpage, 'like count', default=None))
|
||||
view_count = int_or_none(self._html_search_regex(
|
||||
view_count = str_to_int(self._html_search_regex(
|
||||
r'(?s)<span[^>]+class="views-wrapper"[^>]*>(.+?)</span', webpage,
|
||||
'view count', default=None))
|
||||
|
||||
@@ -42,7 +88,5 @@ class ManyVidsIE(InfoExtractor):
|
||||
'title': title,
|
||||
'view_count': view_count,
|
||||
'like_count': like_count,
|
||||
'formats': [{
|
||||
'url': video_url,
|
||||
}],
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
@@ -24,9 +24,9 @@ class PacktPubBaseIE(InfoExtractor):
|
||||
|
||||
|
||||
class PacktPubIE(PacktPubBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?packtpub\.com/mapt/video/[^/]+/(?P<course_id>\d+)/(?P<chapter_id>\d+)/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:(?:www\.)?packtpub\.com/mapt|subscription\.packtpub\.com)/video/[^/]+/(?P<course_id>\d+)/(?P<chapter_id>\d+)/(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
'url': 'https://www.packtpub.com/mapt/video/web-development/9781787122215/20528/20530/Project+Intro',
|
||||
'md5': '1e74bd6cfd45d7d07666f4684ef58f70',
|
||||
'info_dict': {
|
||||
@@ -37,7 +37,10 @@ class PacktPubIE(PacktPubBaseIE):
|
||||
'timestamp': 1490918400,
|
||||
'upload_date': '20170331',
|
||||
},
|
||||
}
|
||||
}, {
|
||||
'url': 'https://subscription.packtpub.com/video/web_development/9781787122215/20528/20530/project-intro',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_NETRC_MACHINE = 'packtpub'
|
||||
_TOKEN = None
|
||||
|
||||
@@ -110,15 +113,18 @@ class PacktPubIE(PacktPubBaseIE):
|
||||
|
||||
|
||||
class PacktPubCourseIE(PacktPubBaseIE):
|
||||
_VALID_URL = r'(?P<url>https?://(?:www\.)?packtpub\.com/mapt/video/[^/]+/(?P<id>\d+))'
|
||||
_TEST = {
|
||||
_VALID_URL = r'(?P<url>https?://(?:(?:www\.)?packtpub\.com/mapt|subscription\.packtpub\.com)/video/[^/]+/(?P<id>\d+))'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.packtpub.com/mapt/video/web-development/9781787122215',
|
||||
'info_dict': {
|
||||
'id': '9781787122215',
|
||||
'title': 'Learn Nodejs by building 12 projects [Video]',
|
||||
},
|
||||
'playlist_count': 90,
|
||||
}
|
||||
}, {
|
||||
'url': 'https://subscription.packtpub.com/video/web_development/9781787122215',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
|
||||
@@ -1,38 +1,46 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .brightcove import BrightcoveLegacyIE
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import smuggle_url
|
||||
|
||||
|
||||
class RMCDecouverteIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/mediaplayer-replay.*?\bid=(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/(?:(?:[^/]+/)*program_(?P<id>\d+)|(?P<live_id>mediaplayer-direct))'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://rmcdecouverte.bfmtv.com/mediaplayer-replay/?id=13502&title=AQUAMEN:LES%20ROIS%20DES%20AQUARIUMS%20:UN%20DELICIEUX%20PROJET',
|
||||
_TESTS = [{
|
||||
'url': 'https://rmcdecouverte.bfmtv.com/wheeler-dealers-occasions-a-saisir/program_2566/',
|
||||
'info_dict': {
|
||||
'id': '5419055995001',
|
||||
'id': '5983675500001',
|
||||
'ext': 'mp4',
|
||||
'title': 'UN DELICIEUX PROJET',
|
||||
'description': 'md5:63610df7c8b1fc1698acd4d0d90ba8b5',
|
||||
'title': 'CORVETTE',
|
||||
'description': 'md5:c1e8295521e45ffebf635d6a7658f506',
|
||||
'uploader_id': '1969646226001',
|
||||
'upload_date': '20170502',
|
||||
'timestamp': 1493745308,
|
||||
'upload_date': '20181226',
|
||||
'timestamp': 1545861635,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'only available for a week',
|
||||
}
|
||||
}, {
|
||||
# live, geo restricted, bypassable
|
||||
'url': 'https://rmcdecouverte.bfmtv.com/mediaplayer-direct/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1969646226001/default_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
display_id = mobj.group('id') or mobj.group('live_id')
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
if brightcove_legacy_url:
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(
|
||||
@@ -41,5 +49,7 @@ class RMCDecouverteIE(InfoExtractor):
|
||||
brightcove_id = self._search_regex(
|
||||
r'data-video-id=["\'](\d+)', webpage, 'brightcove id')
|
||||
return self.url_result(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew',
|
||||
brightcove_id)
|
||||
smuggle_url(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
|
||||
{'geo_countries': ['FR']}),
|
||||
'BrightcoveNew', brightcove_id)
|
||||
|
||||
@@ -30,8 +30,5 @@ class SaveFromIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = os.path.splitext(url.split('/')[-1])[0]
|
||||
return {
|
||||
'_type': 'url',
|
||||
'id': video_id,
|
||||
'url': mobj.group('url'),
|
||||
}
|
||||
|
||||
return self.url_result(mobj.group('url'), video_id=video_id)
|
||||
|
||||
@@ -203,10 +203,8 @@ class TEDIE(InfoExtractor):
|
||||
ext_url = None
|
||||
if service.lower() == 'youtube':
|
||||
ext_url = external.get('code')
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': ext_url or external['uri'],
|
||||
}
|
||||
|
||||
return self.url_result(ext_url or external['uri'])
|
||||
|
||||
resources_ = player_talk.get('resources') or talk_info.get('resources')
|
||||
|
||||
|
||||
@@ -61,8 +61,4 @@ class TestURLIE(InfoExtractor):
|
||||
|
||||
self.to_screen('Test URL: %s' % tc['url'])
|
||||
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': tc['url'],
|
||||
'id': video_id,
|
||||
}
|
||||
return self.url_result(tc['url'], video_id=video_id)
|
||||
|
||||
@@ -40,11 +40,7 @@ class WimpIE(InfoExtractor):
|
||||
r'data-id=["\']([0-9A-Za-z_-]{11})'),
|
||||
webpage, 'video URL', default=None)
|
||||
if youtube_id:
|
||||
return {
|
||||
'_type': 'url',
|
||||
'url': youtube_id,
|
||||
'ie_key': YoutubeIE.ie_key(),
|
||||
}
|
||||
return self.url_result(youtube_id, YoutubeIE.ie_key())
|
||||
|
||||
info_dict = self._extract_jwplayer_data(
|
||||
webpage, video_id, require_title=False)
|
||||
|
||||
@@ -14,6 +14,7 @@ class YourPornIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'md5:c9f43630bd968267672651ba905a7d35',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'age_limit': 18
|
||||
},
|
||||
}
|
||||
|
||||
@@ -26,7 +27,7 @@ class YourPornIE(InfoExtractor):
|
||||
self._search_regex(
|
||||
r'data-vnfo=(["\'])(?P<data>{.+?})\1', webpage, 'data info',
|
||||
group='data'),
|
||||
video_id)[video_id]).replace('/cdn/', '/cdn2/')
|
||||
video_id)[video_id]).replace('/cdn/', '/cdn3/')
|
||||
|
||||
title = (self._search_regex(
|
||||
r'<[^>]+\bclass=["\']PostEditTA[^>]+>([^<]+)', webpage, 'title',
|
||||
@@ -38,4 +39,5 @@ class YourPornIE(InfoExtractor):
|
||||
'url': video_url,
|
||||
'title': title,
|
||||
'thumbnail': thumbnail,
|
||||
'age_limit': 18
|
||||
}
|
||||
|
||||
@@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2018.12.31'
|
||||
__version__ = '2019.01.02'
|
||||
|
||||
Reference in New Issue
Block a user