mirror of
https://github.com/yt-dlp/yt-dlp.git
synced 2025-04-22 08:44:03 +00:00
Compare commits
No commits in common. "master" and "2025.03.25" have entirely different histories.
master...2025.03.25
@@ -758,5 +758,3 @@ somini
 thedenv
 vallovic
 arabcoders
-mireq
-mlabeeb03
36
Changelog.md
@ -4,42 +4,6 @@
|
|||||||
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
|
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
|
||||||
-->
|
-->
|
||||||
|
|
||||||
### 2025.03.31
|
|
||||||
|
|
||||||
#### Core changes
|
|
||||||
- [Add `--compat-options 2024`](https://github.com/yt-dlp/yt-dlp/commit/22e34adbd741e1c7072015debd615dc3fb71c401) ([#12789](https://github.com/yt-dlp/yt-dlp/issues/12789)) by [seproDev](https://github.com/seproDev)
|
|
||||||
|
|
||||||
#### Extractor changes
|
|
||||||
- **francaisfacile**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/bb321cfdc3fd4400598ddb12a15862bc2ac8fc10) ([#12787](https://github.com/yt-dlp/yt-dlp/issues/12787)) by [mlabeeb03](https://github.com/mlabeeb03)
|
|
||||||
- **generic**: [Validate response before checking m3u8 live status](https://github.com/yt-dlp/yt-dlp/commit/9a1ec1d36e172d252714cef712a6d091e0a0c4f2) ([#12784](https://github.com/yt-dlp/yt-dlp/issues/12784)) by [bashonly](https://github.com/bashonly)
|
|
||||||
- **microsoftlearnepisode**: [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/d63696f23a341ee36a3237ccb5d5e14b34c2c579) ([#12799](https://github.com/yt-dlp/yt-dlp/issues/12799)) by [bashonly](https://github.com/bashonly)
|
|
||||||
- **mlbtv**: [Fix radio-only extraction](https://github.com/yt-dlp/yt-dlp/commit/f033d86b96b36f8c5289dd7c3304f42d4d9f6ff4) ([#12792](https://github.com/yt-dlp/yt-dlp/issues/12792)) by [bashonly](https://github.com/bashonly)
|
|
||||||
- **on24**: [Support `mainEvent` URLs](https://github.com/yt-dlp/yt-dlp/commit/e465b078ead75472fcb7b86f6ccaf2b5d3bc4c21) ([#12800](https://github.com/yt-dlp/yt-dlp/issues/12800)) by [bashonly](https://github.com/bashonly)
|
|
||||||
- **sbs**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/29560359120f28adaaac67c86fa8442eb72daa0d) ([#12785](https://github.com/yt-dlp/yt-dlp/issues/12785)) by [bashonly](https://github.com/bashonly)
|
|
||||||
- **stvr**: [Rename extractor from RTVS to STVR](https://github.com/yt-dlp/yt-dlp/commit/5fc521cbd0ce7b2410d0935369558838728e205d) ([#12788](https://github.com/yt-dlp/yt-dlp/issues/12788)) by [mireq](https://github.com/mireq)
|
|
||||||
- **twitch**: clips: [Extract portrait formats](https://github.com/yt-dlp/yt-dlp/commit/61046c31612b30c749cbdae934b7fe26abe659d7) ([#12763](https://github.com/yt-dlp/yt-dlp/issues/12763)) by [DmitryScaletta](https://github.com/DmitryScaletta)
|
|
||||||
- **youtube**
|
|
||||||
- [Add `player_js_variant` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/07f04005e40ebdb368920c511e36e98af0077ed3) ([#12767](https://github.com/yt-dlp/yt-dlp/issues/12767)) by [bashonly](https://github.com/bashonly)
|
|
||||||
- tab: [Fix playlist continuation extraction](https://github.com/yt-dlp/yt-dlp/commit/6a6d97b2cbc78f818de05cc96edcdcfd52caa259) ([#12777](https://github.com/yt-dlp/yt-dlp/issues/12777)) by [coletdjnz](https://github.com/coletdjnz)
|
|
||||||
|
|
||||||
#### Misc. changes
|
|
||||||
- **cleanup**: Miscellaneous: [5e457af](https://github.com/yt-dlp/yt-dlp/commit/5e457af57fae9645b1b8fa0ed689229c8fb9656b) by [bashonly](https://github.com/bashonly)
|
|
||||||
|
|
||||||
### 2025.03.27
|
|
||||||
|
|
||||||
#### Core changes
|
|
||||||
- **jsinterp**: [Fix nested attributes and object extraction](https://github.com/yt-dlp/yt-dlp/commit/a8b9ff3c2a0ae25735e580173becc78545b92572) ([#12760](https://github.com/yt-dlp/yt-dlp/issues/12760)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
|
|
||||||
|
|
||||||
#### Extractor changes
|
|
||||||
- **youtube**: [Make signature and nsig extraction more robust](https://github.com/yt-dlp/yt-dlp/commit/48be862b32648bff5b3e553e40fca4dcc6e88b28) ([#12761](https://github.com/yt-dlp/yt-dlp/issues/12761)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
|
|
||||||
|
|
||||||
### 2025.03.26
|
|
||||||
|
|
||||||
#### Extractor changes
|
|
||||||
- **youtube**
|
|
||||||
- [Fix signature and nsig extraction for player `4fcd6e4a`](https://github.com/yt-dlp/yt-dlp/commit/a550dfc904a02843a26369ae50dbb7c0febfb30e) ([#12748](https://github.com/yt-dlp/yt-dlp/issues/12748)) by [seproDev](https://github.com/seproDev)
|
|
||||||
- [Only cache nsig code on successful decoding](https://github.com/yt-dlp/yt-dlp/commit/ecee97b4fa90d51c48f9154c3a6d5a8ffe46cd5c) ([#12750](https://github.com/yt-dlp/yt-dlp/issues/12750)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
|
|
||||||
|
|
||||||
### 2025.03.25
|
### 2025.03.25
|
||||||
|
|
||||||
#### Core changes
|
#### Core changes
|
||||||
|
10
README.md
@ -1770,7 +1770,7 @@ The following extractors use this feature:
|
|||||||
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
|
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
|
||||||
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
|
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
|
||||||
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
|
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
|
||||||
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
|
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
|
||||||
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
|
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
|
||||||
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
|
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
|
||||||
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
|
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
|
||||||
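The `player_client` entry above describes a small selection syntax: `default` and `all` expand to groups of clients, and a leading `-` excludes a client. A minimal sketch of that syntax follows. `resolve_clients` and the two constants are hypothetical names used for illustration only, the client lists are copied from the README text in this hunk, and this is not yt-dlp's own resolution logic (which also adds clients depending on cookies and age restrictions).

```python
# Illustrative only: a hypothetical resolver for the `player_client` syntax
# described in the README text. Not yt-dlp's actual implementation.
AVAILABLE_CLIENTS = [
    'web', 'web_safari', 'web_embedded', 'web_music', 'web_creator',
    'mweb', 'ios', 'android', 'android_vr', 'tv', 'tv_embedded',
]
DEFAULT_CLIENTS = ['tv', 'ios', 'web']  # default without cookies, per the README


def resolve_clients(spec: str) -> list[str]:
    """Expand a comma-separated client spec into an ordered, de-duplicated list."""
    requested, excluded = [], set()
    for item in (part.strip() for part in spec.split(',')):
        if not item:
            continue
        if item.startswith('-'):
            excluded.add(item[1:])          # `-ios` removes `ios` from the result
        elif item == 'default':
            requested.extend(DEFAULT_CLIENTS)
        elif item == 'all':
            requested.extend(AVAILABLE_CLIENTS)
        elif item in AVAILABLE_CLIENTS:
            requested.append(item)
        else:
            raise ValueError(f'unknown client: {item!r}')

    result = []
    for client in requested:
        if client not in excluded and client not in result:
            result.append(client)
    return result


print(resolve_clients('default,-ios'))  # ['tv', 'web']
```

Exclusions are collected first and applied at the end, so under this sketch `-ios,default` and `default,-ios` give the same result.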
@ -1782,7 +1782,6 @@ The following extractors use this feature:
|
|||||||
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
|
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
|
||||||
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
|
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
|
||||||
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma-separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player+XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
|
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma-separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player+XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
|
||||||
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
|
|
||||||
|
|
||||||
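The `po_token` entry above gives the value format as `CLIENT.CONTEXT+PO_TOKEN`, with `gvs` and `player` as the two contexts. The sketch below parses a value under a strict reading of that format. `PoToken` and `parse_po_tokens` are hypothetical names for illustration and do not reflect how yt-dlp parses the argument internally.

```python
# Illustrative only: a strict parser for the `CLIENT.CONTEXT+PO_TOKEN` format
# described in the README text. Names here are hypothetical, not yt-dlp APIs.
from typing import NamedTuple

VALID_CONTEXTS = {'gvs', 'player'}


class PoToken(NamedTuple):
    client: str
    context: str
    token: str


def parse_po_tokens(value: str) -> list[PoToken]:
    """Split a comma-separated PO token list into (client, context, token) entries."""
    tokens = []
    for entry in value.split(','):
        prefix, plus, token = entry.partition('+')      # CLIENT.CONTEXT | + | PO_TOKEN
        client, dot, context = prefix.partition('.')    # CLIENT | . | CONTEXT
        if not (plus and dot and client and token) or context not in VALID_CONTEXTS:
            raise ValueError(f'invalid PO token entry: {entry!r}')
        tokens.append(PoToken(client, context, token))
    return tokens


for parsed in parse_po_tokens('web.gvs+XXX,web_safari.gvs+YYY'):
    print(parsed.client, parsed.context, parsed.token)
```

Under this strict reading, an entry that uses `=` instead of `+` between the context and the token is rejected with a `ValueError`.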
#### youtubetab (YouTube playlists, channels, feeds, etc.)
|
#### youtubetab (YouTube playlists, channels, feeds, etc.)
|
||||||
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
|
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
|
||||||
@ -2219,7 +2218,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
|
|||||||
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
|
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
|
||||||
* YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
|
* YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
|
||||||
* Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
|
* Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
|
||||||
* The upload dates extracted from YouTube are in UTC.
|
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
|
||||||
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
|
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
|
||||||
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
|
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
|
||||||
* Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
|
* Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
|
||||||
@ -2238,10 +2237,9 @@ For ease of use, a few more compat options are available:
|
|||||||
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
|
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
|
||||||
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
|
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
|
||||||
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
|
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
|
||||||
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
|
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
|
||||||
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
|
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
|
||||||
* `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
|
* `--compat-options 2023`: Same as `--compat-options prefer-vp9-sort`. Use this to enable all future compat options
|
||||||
* `--compat-options 2024`: Currently does nothing. Use this to enable all future compat options
|
|
||||||
|
|
||||||
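The year aliases above are defined in terms of one another, so each one expands to a chain of individual options. A minimal sketch of that expansion, using the definitions from the `master` column of this hunk (where `2024` is the terminal alias), is given below. `COMPAT_ALIASES` and `expand_compat` are hypothetical names, and the sketch does not model `all`, `youtube-dl`, `youtube-dlc` or `-` exclusions.

```python
# Illustrative only: expands the year aliases as listed in the `master` column.
# Not yt-dlp's actual option handling.
COMPAT_ALIASES = {
    '2021': ['2022', 'no-certifi', 'filename-sanitization'],
    '2022': ['2023', 'playlist-match-filter', 'no-external-downloader-progress',
             'prefer-legacy-http-handler', 'manifest-filesize-approx'],
    '2023': ['2024', 'prefer-vp9-sort'],
    '2024': [],  # "Currently does nothing"
}


def expand_compat(option: str) -> list[str]:
    """Recursively expand an alias into the individual compat options it implies."""
    if option not in COMPAT_ALIASES:
        return [option]
    expanded = []
    for item in COMPAT_ALIASES[option]:
        for name in expand_compat(item):
            if name not in expanded:
                expanded.append(name)
    return expanded


print(expand_compat('2023'))  # ['prefer-vp9-sort']
print(expand_compat('2021'))
```

On the `2025.03.25` side of the comparison the table would differ: `2023` is the terminal alias there and `2021` additionally includes `no-youtube-prefer-utc-upload-date`.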
The following compat options restore vulnerable behavior from before security patches:
|
The following compat options restore vulnerable behavior from before security patches:
|
||||||
|
|
||||||
|
@ -472,7 +472,6 @@ The only reliable way to check if a site is supported is to try it.
|
|||||||
- **FoxNewsVideo**
|
- **FoxNewsVideo**
|
||||||
- **FoxSports**
|
- **FoxSports**
|
||||||
- **fptplay**: fptplay.vn
|
- **fptplay**: fptplay.vn
|
||||||
- **FrancaisFacile**
|
|
||||||
- **FranceCulture**
|
- **FranceCulture**
|
||||||
- **FranceInter**
|
- **FranceInter**
|
||||||
- **francetv**
|
- **francetv**
|
||||||
@ -1252,6 +1251,7 @@ The only reliable way to check if a site is supported is to try it.
|
|||||||
- **rtve.es:infantil**: RTVE infantil
|
- **rtve.es:infantil**: RTVE infantil
|
||||||
- **rtve.es:live**: RTVE.es live streams
|
- **rtve.es:live**: RTVE.es live streams
|
||||||
- **rtve.es:television**
|
- **rtve.es:television**
|
||||||
|
- **RTVS**
|
||||||
- **rtvslo.si**
|
- **rtvslo.si**
|
||||||
- **rtvslo.si:show**
|
- **rtvslo.si:show**
|
||||||
- **RudoVideo**
|
- **RudoVideo**
|
||||||
@ -1407,7 +1407,6 @@ The only reliable way to check if a site is supported is to try it.
|
|||||||
- **StretchInternet**
|
- **StretchInternet**
|
||||||
- **Stripchat**
|
- **Stripchat**
|
||||||
- **stv:player**
|
- **stv:player**
|
||||||
- **stvr**: Slovak Television and Radio (formerly RTVS)
|
|
||||||
- **Subsplash**
|
- **Subsplash**
|
||||||
- **subsplash:playlist**
|
- **subsplash:playlist**
|
||||||
- **Substack**
|
- **Substack**
|
||||||
|
@ -136,7 +136,7 @@ def _iter_differences(got, expected, field):
|
|||||||
return
|
return
|
||||||
|
|
||||||
if op == 'startswith':
|
if op == 'startswith':
|
||||||
if not got.startswith(val):
|
if not val.startswith(got):
|
||||||
yield field, f'should start with {val!r}, got {got!r}'
|
yield field, f'should start with {val!r}, got {got!r}'
|
||||||
return
|
return
|
||||||
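The two columns of this hunk differ only in which string is asked for the prefix. The sketch below uses hypothetical values to show that the two directions are not interchangeable. `prefix_difference` is an illustrative stand-in, not the test helper itself. The error message describes `got` as the string that should start with `val`, which corresponds to the `got.startswith(val)` form.

```python
# Illustrative only: a stand-in for the 'startswith' branch, with made-up values.
def prefix_difference(got: str, val: str, field: str = 'url'):
    """Return (field, message) if `got` does not begin with `val`, else None."""
    if not got.startswith(val):
        return field, f'should start with {val!r}, got {got!r}'
    return None


got = 'https://example.com/video/123.mp4'  # hypothetical extracted value
val = 'https://example.com/video/'         # hypothetical expected prefix

print(prefix_difference(got, val))  # None: `got` begins with the expected prefix
print(val.startswith(got))          # False: the reversed check fails for the same pair
```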
|
|
||||||
|
@ -118,7 +118,6 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31)
|
self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31)
|
||||||
self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51)
|
self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51)
|
||||||
self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11)
|
self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11)
|
||||||
self._test('function f(){var x = 2; var y = ["a", "b"]; y[x%y["length"]]="z"; return y}', ['z', 'b'])
|
|
||||||
|
|
||||||
@unittest.skip('Not implemented')
|
@unittest.skip('Not implemented')
|
||||||
def test_comments(self):
|
def test_comments(self):
|
||||||
@ -404,8 +403,6 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
test_result = list('test')
|
test_result = list('test')
|
||||||
tests = [
|
tests = [
|
||||||
'function f(a, b){return a.split(b)}',
|
'function f(a, b){return a.split(b)}',
|
||||||
'function f(a, b){return a["split"](b)}',
|
|
||||||
'function f(a, b){let x = ["split"]; return a[x[0]](b)}',
|
|
||||||
'function f(a, b){return String.prototype.split.call(a, b)}',
|
'function f(a, b){return String.prototype.split.call(a, b)}',
|
||||||
'function f(a, b){return String.prototype.split.apply(a, [b])}',
|
'function f(a, b){return String.prototype.split.apply(a, [b])}',
|
||||||
]
|
]
|
||||||
@ -444,9 +441,6 @@ class TestJSInterpreter(unittest.TestCase):
|
|||||||
self._test('function f(){return "012345678".slice(-1, 1)}', '')
|
self._test('function f(){return "012345678".slice(-1, 1)}', '')
|
||||||
self._test('function f(){return "012345678".slice(-3, -1)}', '67')
|
self._test('function f(){return "012345678".slice(-3, -1)}', '67')
|
||||||
|
|
||||||
def test_splice(self):
|
|
||||||
self._test('function f(){var T = ["0", "1", "2"]; T["splice"](2, 1, "0")[0]; return T }', ['0', '1', '0'])
|
|
||||||
|
|
||||||
def test_js_number_to_string(self):
|
def test_js_number_to_string(self):
|
||||||
for test, radix, expected in [
|
for test, radix, expected in [
|
||||||
(0, None, '0'),
|
(0, None, '0'),
|
||||||
|
@ -39,7 +39,6 @@ from yt_dlp.cookies import YoutubeDLCookieJar
|
|||||||
from yt_dlp.dependencies import brotli, curl_cffi, requests, urllib3
|
from yt_dlp.dependencies import brotli, curl_cffi, requests, urllib3
|
||||||
from yt_dlp.networking import (
|
from yt_dlp.networking import (
|
||||||
HEADRequest,
|
HEADRequest,
|
||||||
PATCHRequest,
|
|
||||||
PUTRequest,
|
PUTRequest,
|
||||||
Request,
|
Request,
|
||||||
RequestDirector,
|
RequestDirector,
|
||||||
@ -1857,7 +1856,6 @@ class TestRequest:
|
|||||||
|
|
||||||
def test_request_helpers(self):
|
def test_request_helpers(self):
|
||||||
assert HEADRequest('http://example.com').method == 'HEAD'
|
assert HEADRequest('http://example.com').method == 'HEAD'
|
||||||
assert PATCHRequest('http://example.com').method == 'PATCH'
|
|
||||||
assert PUTRequest('http://example.com').method == 'PUT'
|
assert PUTRequest('http://example.com').method == 'PUT'
|
||||||
|
|
||||||
def test_headers(self):
|
def test_headers(self):
|
||||||
|
@ -659,8 +659,6 @@ class TestUtil(unittest.TestCase):
|
|||||||
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
|
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
|
||||||
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
|
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
|
||||||
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
|
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
|
||||||
self.assertEqual(url_or_none('ws://foo.de'), 'ws://foo.de')
|
|
||||||
self.assertEqual(url_or_none('wss://foo.de'), 'wss://foo.de')
|
|
||||||
|
|
||||||
def test_parse_age_limit(self):
|
def test_parse_age_limit(self):
|
||||||
self.assertEqual(parse_age_limit(None), None)
|
self.assertEqual(parse_age_limit(None), None)
|
||||||
|
@ -88,51 +88,6 @@ _SIG_TESTS = [
|
|||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
||||||
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
||||||
),
|
),
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/363db69b/player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/player_ias.vflset/en_US/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
|
|
||||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
|
||||||
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
|
|
||||||
),
|
|
||||||
]
|
]
|
||||||
|
|
||||||
_NSIG_TESTS = [
|
_NSIG_TESTS = [
|
||||||
@ -288,34 +243,6 @@ _NSIG_TESTS = [
|
|||||||
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
|
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
|
||||||
'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
|
'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
|
||||||
),
|
),
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
|
|
||||||
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/tv-player-ias.vflset/tv-player-ias.js',
|
|
||||||
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
|
|
||||||
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
|
|
||||||
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
|
|
||||||
),
|
|
||||||
(
|
|
||||||
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
|
|
||||||
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
|
|
||||||
),
|
|
||||||
]
|
]
|
||||||
|
|
||||||
|
|
||||||
@@ -366,33 +293,33 @@ def t_factory(name, sig_func, url_pattern):
         test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))

         def test_func(self):
-            basename = f'player-{test_id}.js'
+            basename = f'player-{name}-{test_id}.js'
             fn = os.path.join(self.TESTDATA_DIR, basename)

             if not os.path.exists(fn):
                 urllib.request.urlretrieve(url, fn)
             with open(fn, encoding='utf-8') as testf:
                 jscode = testf.read()
-            self.assertEqual(sig_func(jscode, sig_input, url), expected_sig)
+            self.assertEqual(sig_func(jscode, sig_input), expected_sig)

         test_func.__name__ = f'test_{name}_js_{test_id}'
         setattr(TestSignature, test_func.__name__, test_func)
     return make_tfunc


-def signature(jscode, sig_input, player_url):
-    func = YoutubeIE(FakeYDL())._parse_sig_js(jscode, player_url)
+def signature(jscode, sig_input):
+    func = YoutubeIE(FakeYDL())._parse_sig_js(jscode)
     src_sig = (
         str(string.printable[:sig_input])
         if isinstance(sig_input, int) else sig_input)
     return func(src_sig)


-def n_sig(jscode, sig_input, player_url):
+def n_sig(jscode, sig_input):
     ie = YoutubeIE(FakeYDL())
-    funcname = ie._extract_n_function_name(jscode, player_url=player_url)
+    funcname = ie._extract_n_function_name(jscode)
     jsi = JSInterpreter(jscode)
-    func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode, player_url))
+    func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode))
     return func([sig_input])


|
@ -85,7 +85,6 @@ class NiconicoLiveFD(FileDownloader):
|
|||||||
'quality': live_quality,
|
'quality': live_quality,
|
||||||
'protocol': 'hls+fmp4',
|
'protocol': 'hls+fmp4',
|
||||||
'latency': live_latency,
|
'latency': live_latency,
|
||||||
'accessRightMethod': 'single_cookie',
|
|
||||||
'chasePlay': False,
|
'chasePlay': False,
|
||||||
},
|
},
|
||||||
'room': {
|
'room': {
|
||||||
|
@ -683,7 +683,6 @@ from .foxnews import (
|
|||||||
)
|
)
|
||||||
from .foxsports import FoxSportsIE
|
from .foxsports import FoxSportsIE
|
||||||
from .fptplay import FptplayIE
|
from .fptplay import FptplayIE
|
||||||
from .francaisfacile import FrancaisFacileIE
|
|
||||||
from .franceinter import FranceInterIE
|
from .franceinter import FranceInterIE
|
||||||
from .francetv import (
|
from .francetv import (
|
||||||
FranceTVIE,
|
FranceTVIE,
|
||||||
@ -903,7 +902,6 @@ from .ivi import (
|
|||||||
IviIE,
|
IviIE,
|
||||||
)
|
)
|
||||||
from .ivideon import IvideonIE
|
from .ivideon import IvideonIE
|
||||||
from .ivoox import IvooxIE
|
|
||||||
from .iwara import (
|
from .iwara import (
|
||||||
IwaraIE,
|
IwaraIE,
|
||||||
IwaraPlaylistIE,
|
IwaraPlaylistIE,
|
||||||
@ -961,10 +959,7 @@ from .kick import (
|
|||||||
)
|
)
|
||||||
from .kicker import KickerIE
|
from .kicker import KickerIE
|
||||||
from .kickstarter import KickStarterIE
|
from .kickstarter import KickStarterIE
|
||||||
from .kika import (
|
from .kika import KikaIE
|
||||||
KikaIE,
|
|
||||||
KikaPlaylistIE,
|
|
||||||
)
|
|
||||||
from .kinja import KinjaEmbedIE
|
from .kinja import KinjaEmbedIE
|
||||||
from .kinopoisk import KinoPoiskIE
|
from .kinopoisk import KinoPoiskIE
|
||||||
from .kommunetv import KommunetvIE
|
from .kommunetv import KommunetvIE
|
||||||
@ -1065,7 +1060,6 @@ from .loom import (
|
|||||||
from .lovehomeporn import LoveHomePornIE
|
from .lovehomeporn import LoveHomePornIE
|
||||||
from .lrt import (
|
from .lrt import (
|
||||||
LRTVODIE,
|
LRTVODIE,
|
||||||
LRTRadioIE,
|
|
||||||
LRTStreamIE,
|
LRTStreamIE,
|
||||||
)
|
)
|
||||||
from .lsm import (
|
from .lsm import (
|
||||||
@ -1498,10 +1492,6 @@ from .paramountplus import (
|
|||||||
)
|
)
|
||||||
from .parler import ParlerIE
|
from .parler import ParlerIE
|
||||||
from .parlview import ParlviewIE
|
from .parlview import ParlviewIE
|
||||||
from .parti import (
|
|
||||||
PartiLivestreamIE,
|
|
||||||
PartiVideoIE,
|
|
||||||
)
|
|
||||||
from .patreon import (
|
from .patreon import (
|
||||||
PatreonCampaignIE,
|
PatreonCampaignIE,
|
||||||
PatreonIE,
|
PatreonIE,
|
||||||
@ -1748,7 +1738,6 @@ from .roosterteeth import (
|
|||||||
RoosterTeethSeriesIE,
|
RoosterTeethSeriesIE,
|
||||||
)
|
)
|
||||||
from .rottentomatoes import RottenTomatoesIE
|
from .rottentomatoes import RottenTomatoesIE
|
||||||
from .roya import RoyaLiveIE
|
|
||||||
from .rozhlas import (
|
from .rozhlas import (
|
||||||
MujRozhlasIE,
|
MujRozhlasIE,
|
||||||
RozhlasIE,
|
RozhlasIE,
|
||||||
@ -1783,6 +1772,7 @@ from .rtvcplay import (
|
|||||||
from .rtve import (
|
from .rtve import (
|
||||||
RTVEALaCartaIE,
|
RTVEALaCartaIE,
|
||||||
RTVEAudioIE,
|
RTVEAudioIE,
|
||||||
|
RTVEInfantilIE,
|
||||||
RTVELiveIE,
|
RTVELiveIE,
|
||||||
RTVETelevisionIE,
|
RTVETelevisionIE,
|
||||||
)
|
)
|
||||||
@ -2236,10 +2226,7 @@ from .tvplay import (
|
|||||||
TVPlayIE,
|
TVPlayIE,
|
||||||
)
|
)
|
||||||
from .tvplayer import TVPlayerIE
|
from .tvplayer import TVPlayerIE
|
||||||
from .tvw import (
|
from .tvw import TvwIE
|
||||||
TvwIE,
|
|
||||||
TvwTvChannelsIE,
|
|
||||||
)
|
|
||||||
from .tweakers import TweakersIE
|
from .tweakers import TweakersIE
|
||||||
from .twentymin import TwentyMinutenIE
|
from .twentymin import TwentyMinutenIE
|
||||||
from .twentythreevideo import TwentyThreeVideoIE
|
from .twentythreevideo import TwentyThreeVideoIE
|
||||||
|
@ -21,7 +21,6 @@ from ..utils import (
|
|||||||
int_or_none,
|
int_or_none,
|
||||||
time_seconds,
|
time_seconds,
|
||||||
traverse_obj,
|
traverse_obj,
|
||||||
update_url,
|
|
||||||
update_url_query,
|
update_url_query,
|
||||||
)
|
)
|
||||||
|
|
||||||
@ -418,10 +417,6 @@ class AbemaTVIE(AbemaTVBaseIE):
|
|||||||
'is_live': is_live,
|
'is_live': is_live,
|
||||||
'availability': availability,
|
'availability': availability,
|
||||||
})
|
})
|
||||||
|
|
||||||
if thumbnail := update_url(self._og_search_thumbnail(webpage, default=''), query=None):
|
|
||||||
info['thumbnails'] = [{'url': thumbnail}]
|
|
||||||
|
|
||||||
return info
|
return info
|
||||||
|
|
||||||
|
|
||||||
|
@ -146,7 +146,7 @@ class TokFMPodcastIE(InfoExtractor):
|
|||||||
'url': 'https://audycje.tokfm.pl/podcast/91275,-Systemowy-rasizm-Czy-zamieszki-w-USA-po-morderstwie-w-Minneapolis-doprowadza-do-zmian-w-sluzbach-panstwowych',
|
'url': 'https://audycje.tokfm.pl/podcast/91275,-Systemowy-rasizm-Czy-zamieszki-w-USA-po-morderstwie-w-Minneapolis-doprowadza-do-zmian-w-sluzbach-panstwowych',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '91275',
|
'id': '91275',
|
||||||
'ext': 'mp3',
|
'ext': 'aac',
|
||||||
'title': 'md5:a9b15488009065556900169fb8061cce',
|
'title': 'md5:a9b15488009065556900169fb8061cce',
|
||||||
'episode': 'md5:a9b15488009065556900169fb8061cce',
|
'episode': 'md5:a9b15488009065556900169fb8061cce',
|
||||||
'series': 'Analizy',
|
'series': 'Analizy',
|
||||||
@ -164,20 +164,23 @@ class TokFMPodcastIE(InfoExtractor):
|
|||||||
raise ExtractorError('No such podcast', expected=True)
|
raise ExtractorError('No such podcast', expected=True)
|
||||||
metadata = metadata[0]
|
metadata = metadata[0]
|
||||||
|
|
||||||
mp3_url = self._download_json(
|
formats = []
|
||||||
'https://api.podcast.radioagora.pl/api4/getSongUrl',
|
for ext in ('aac', 'mp3'):
|
||||||
media_id, 'Downloading podcast mp3 URL', query={
|
url_data = self._download_json(
|
||||||
'podcast_id': media_id,
|
f'https://api.podcast.radioagora.pl/api4/getSongUrl?podcast_id={media_id}&device_id={uuid.uuid4()}&ppre=false&audio={ext}',
|
||||||
'device_id': str(uuid.uuid4()),
|
media_id, f'Downloading podcast {ext} URL')
|
||||||
'ppre': 'false',
|
# prevents inserting the mp3 (default) multiple times
|
||||||
'audio': 'mp3',
|
if 'link_ssl' in url_data and f'.{ext}' in url_data['link_ssl']:
|
||||||
})['link_ssl']
|
formats.append({
|
||||||
|
'url': url_data['link_ssl'],
|
||||||
|
'ext': ext,
|
||||||
|
'vcodec': 'none',
|
||||||
|
'acodec': ext,
|
||||||
|
})
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': media_id,
|
'id': media_id,
|
||||||
'url': mp3_url,
|
'formats': formats,
|
||||||
'vcodec': 'none',
|
|
||||||
'ext': 'mp3',
|
|
||||||
'title': metadata.get('podcast_name'),
|
'title': metadata.get('podcast_name'),
|
||||||
'series': metadata.get('series_name'),
|
'series': metadata.get('series_name'),
|
||||||
'episode': metadata.get('podcast_name'),
|
'episode': metadata.get('podcast_name'),
|
||||||
|
@ -1,105 +1,64 @@
|
|||||||
import urllib.parse
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..networking.exceptions import HTTPError
|
from ..networking.exceptions import HTTPError
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
parse_age_limit,
|
|
||||||
url_or_none,
|
|
||||||
urlencode_postdata,
|
urlencode_postdata,
|
||||||
)
|
)
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class AtresPlayerIE(InfoExtractor):
|
class AtresPlayerIE(InfoExtractor):
|
||||||
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/(?:[^/?#]+/){4}(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
|
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
|
||||||
_NETRC_MACHINE = 'atresplayer'
|
_NETRC_MACHINE = 'atresplayer'
|
||||||
_TESTS = [{
|
_TESTS = [
|
||||||
'url': 'https://www.atresplayer.com/lasexta/programas/el-objetivo/clips/mbappe-describe-como-entrenador-a-carlo-ancelotti-sabe-cuando-tiene-que-ser-padre-jefe-amigo-entrenador_67f2dfb2fb6ab0e4c7203849/',
|
{
|
||||||
'info_dict': {
|
'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/',
|
||||||
'ext': 'mp4',
|
'info_dict': {
|
||||||
'id': '67f2dfb2fb6ab0e4c7203849',
|
'id': '5d4aa2c57ed1a88fc715a615',
|
||||||
'display_id': 'md5:c203f8d4e425ed115ba56a1c6e4b3e6c',
|
'ext': 'mp4',
|
||||||
'title': 'Mbappé describe como entrenador a Carlo Ancelotti: "Sabe cuándo tiene que ser padre, jefe, amigo, entrenador..."',
|
'title': 'Capítulo 7: Asuntos pendientes',
|
||||||
'channel': 'laSexta',
|
'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc',
|
||||||
'duration': 31,
|
'duration': 3413,
|
||||||
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages02/2025/04/06/B02DBE1E-D59B-4683-8404-1A9595D15269/1920x1080.jpg',
|
},
|
||||||
'tags': ['Entrevista informativa', 'Actualidad', 'Debate informativo', 'Política', 'Economía', 'Sociedad', 'Cara a cara', 'Análisis', 'Más periodismo'],
|
'skip': 'This video is only available for registered users',
|
||||||
'series': 'El Objetivo',
|
|
||||||
'season': 'Temporada 12',
|
|
||||||
'timestamp': 1743970079,
|
|
||||||
'upload_date': '20250406',
|
|
||||||
},
|
},
|
||||||
}, {
|
{
|
||||||
'url': 'https://www.atresplayer.com/antena3/programas/el-hormiguero/clips/revive-la-entrevista-completa-a-miguel-bose-en-el-hormiguero_67f836baa4a5b0e4147ca59a/',
|
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/',
|
||||||
'info_dict': {
|
'only_matching': True,
|
||||||
'ext': 'mp4',
|
|
||||||
'id': '67f836baa4a5b0e4147ca59a',
|
|
||||||
'display_id': 'revive-la-entrevista-completa-a-miguel-bose-en-el-hormiguero',
|
|
||||||
'title': 'Revive la entrevista completa a Miguel Bosé en El Hormiguero',
|
|
||||||
'description': 'md5:c6d2b591408d45a7bc2986dfb938eb72',
|
|
||||||
'channel': 'Antena 3',
|
|
||||||
'duration': 2556,
|
|
||||||
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages02/2025/04/10/9076395F-F1FD-48BE-9F18-540DBA10EBAD/1920x1080.jpg',
|
|
||||||
'tags': ['Entrevista', 'Variedades', 'Humor', 'Entretenimiento', 'Te sigo', 'Buen rollo', 'Cara a cara'],
|
|
||||||
'series': 'El Hormiguero ',
|
|
||||||
'season': 'Temporada 14',
|
|
||||||
'timestamp': 1744320111,
|
|
||||||
'upload_date': '20250410',
|
|
||||||
},
|
},
|
||||||
}, {
|
{
|
||||||
'url': 'https://www.atresplayer.com/flooxer/series/biara-proyecto-lazarus/temporada-1/capitulo-3-supervivientes_67a6038b64ceca00070f4f69/',
|
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/',
|
||||||
'info_dict': {
|
'only_matching': True,
|
||||||
'ext': 'mp4',
|
|
||||||
'id': '67a6038b64ceca00070f4f69',
|
|
||||||
'display_id': 'capitulo-3-supervivientes',
|
|
||||||
'title': 'Capítulo 3: Supervivientes',
|
|
||||||
'description': 'md5:65b231f20302f776c2b0dd24594599a1',
|
|
||||||
'channel': 'Flooxer',
|
|
||||||
'duration': 1196,
|
|
||||||
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages01/2025/02/14/17CF90D3-FE67-40C5-A941-7825B3E13992/1920x1080.jpg',
|
|
||||||
'tags': ['Juvenil', 'Terror', 'Piel de gallina', 'Te sigo', 'Un break', 'Del tirón'],
|
|
||||||
'series': 'BIARA: Proyecto Lázarus',
|
|
||||||
'season': 'Temporada 1',
|
|
||||||
'season_number': 1,
|
|
||||||
'episode': 'Episode 3',
|
|
||||||
'episode_number': 3,
|
|
||||||
'timestamp': 1743095191,
|
|
||||||
'upload_date': '20250327',
|
|
||||||
},
|
},
|
||||||
}, {
|
]
|
||||||
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
_API_BASE = 'https://api.atresplayer.com/'
|
_API_BASE = 'https://api.atresplayer.com/'
|
||||||
|
|
||||||
def _perform_login(self, username, password):
|
def _perform_login(self, username, password):
|
||||||
|
self._request_webpage(
|
||||||
|
self._API_BASE + 'login', None, 'Downloading login page')
|
||||||
|
|
||||||
try:
|
try:
|
||||||
self._download_webpage(
|
target_url = self._download_json(
|
||||||
'https://account.atresplayer.com/auth/v1/login', None,
|
'https://account.atresmedia.com/api/login', None,
|
||||||
'Logging in', 'Failed to log in', data=urlencode_postdata({
|
'Logging in', headers={
|
||||||
|
'Content-Type': 'application/x-www-form-urlencoded',
|
||||||
|
}, data=urlencode_postdata({
|
||||||
'username': username,
|
'username': username,
|
||||||
'password': password,
|
'password': password,
|
||||||
}))
|
}))['targetUrl']
|
||||||
except ExtractorError as e:
|
except ExtractorError as e:
|
||||||
if isinstance(e.cause, HTTPError) and e.cause.status == 400:
|
if isinstance(e.cause, HTTPError) and e.cause.status == 400:
|
||||||
raise ExtractorError('Invalid username and/or password', expected=True)
|
raise ExtractorError('Invalid username and/or password', expected=True)
|
||||||
raise
|
raise
|
||||||
|
|
||||||
|
self._request_webpage(target_url, None, 'Following Target URL')
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
display_id, video_id = self._match_valid_url(url).groups()
|
display_id, video_id = self._match_valid_url(url).groups()
|
||||||
|
|
||||||
metadata_url = self._download_json(
|
|
||||||
self._API_BASE + 'client/v1/url', video_id, 'Downloading API endpoint data',
|
|
||||||
query={'href': urllib.parse.urlparse(url).path})['href']
|
|
||||||
metadata = self._download_json(metadata_url, video_id)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
video_data = self._download_json(metadata['urlVideo'], video_id, 'Downloading video data')
|
episode = self._download_json(
|
||||||
|
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
|
||||||
except ExtractorError as e:
|
except ExtractorError as e:
|
||||||
if isinstance(e.cause, HTTPError) and e.cause.status == 403:
|
if isinstance(e.cause, HTTPError) and e.cause.status == 403:
|
||||||
error = self._parse_json(e.cause.response.read(), None)
|
error = self._parse_json(e.cause.response.read(), None)
|
||||||
@ -108,45 +67,37 @@ class AtresPlayerIE(InfoExtractor):
|
|||||||
raise ExtractorError(error['error_description'], expected=True)
|
raise ExtractorError(error['error_description'], expected=True)
|
||||||
raise
|
raise
|
||||||
|
|
||||||
|
title = episode['titulo']
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
subtitles = {}
|
subtitles = {}
|
||||||
for source in traverse_obj(video_data, ('sources', lambda _, v: url_or_none(v['src']))):
|
for source in episode.get('sources', []):
|
||||||
src_url = source['src']
|
src = source.get('src')
|
||||||
src_type = source.get('type')
|
if not src:
|
||||||
if src_type in ('application/vnd.apple.mpegurl', 'application/hls+legacy', 'application/hls+hevc'):
|
|
||||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(
|
|
||||||
src_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
|
|
||||||
elif src_type in ('application/dash+xml', 'application/dash+hevc'):
|
|
||||||
fmts, subs = self._extract_mpd_formats_and_subtitles(
|
|
||||||
src_url, video_id, mpd_id='dash', fatal=False)
|
|
||||||
else:
|
|
||||||
continue
|
continue
|
||||||
formats.extend(fmts)
|
src_type = source.get('type')
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
if src_type == 'application/vnd.apple.mpegurl':
|
||||||
|
formats, subtitles = self._extract_m3u8_formats(
|
||||||
|
src, video_id, 'mp4', 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False)
|
||||||
|
elif src_type == 'application/dash+xml':
|
||||||
|
formats, subtitles = self._extract_mpd_formats(
|
||||||
|
src, video_id, mpd_id='dash', fatal=False)
|
||||||
|
|
||||||
|
heartbeat = episode.get('heartbeat') or {}
|
||||||
|
omniture = episode.get('omniture') or {}
|
||||||
|
get_meta = lambda x: heartbeat.get(x) or omniture.get(x)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'display_id': display_id,
|
'display_id': display_id,
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'description': episode.get('descripcion'),
|
||||||
|
'thumbnail': episode.get('imgPoster'),
|
||||||
|
'duration': int_or_none(episode.get('duration')),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
|
'channel': get_meta('channel'),
|
||||||
|
'season': get_meta('season'),
|
||||||
|
'episode_number': int_or_none(get_meta('episodeNumber')),
|
||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
**traverse_obj(video_data, {
|
|
||||||
'title': ('titulo', {str}),
|
|
||||||
'description': ('descripcion', {str}),
|
|
||||||
'duration': ('duration', {int_or_none}),
|
|
||||||
'thumbnail': ('imgPoster', {url_or_none}, {lambda v: f'{v}1920x1080.jpg'}),
|
|
||||||
'age_limit': ('ageRating', {parse_age_limit}),
|
|
||||||
}),
|
|
||||||
**traverse_obj(metadata, {
|
|
||||||
'title': ('title', {str}),
|
|
||||||
'description': ('description', {str}),
|
|
||||||
'duration': ('duration', {int_or_none}),
|
|
||||||
'tags': ('tags', ..., 'title', {str}),
|
|
||||||
'age_limit': ('ageRating', {parse_age_limit}),
|
|
||||||
'series': ('format', 'title', {str}),
|
|
||||||
'season': ('currentSeason', 'title', {str}),
|
|
||||||
'season_number': ('currentSeason', 'seasonNumber', {int_or_none}),
|
|
||||||
'episode_number': ('numberOfEpisode', {int_or_none}),
|
|
||||||
'timestamp': ('publicationDate', {int_or_none(scale=1000)}),
|
|
||||||
'channel': ('channel', 'title', {str}),
|
|
||||||
}),
|
|
||||||
}
|
}
|
||||||
|
@ -353,7 +353,7 @@ class CDAIE(InfoExtractor):
|
|||||||
|
|
||||||
class CDAFolderIE(InfoExtractor):
|
class CDAFolderIE(InfoExtractor):
|
||||||
_MAX_PAGE_SIZE = 36
|
_MAX_PAGE_SIZE = 36
|
||||||
_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>[\w-]+)/folder/(?P<id>\d+)'
|
_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>\w+)/folder/(?P<id>\d+)'
|
||||||
_TESTS = [
|
_TESTS = [
|
||||||
{
|
{
|
||||||
'url': 'https://www.cda.pl/domino264/folder/31188385',
|
'url': 'https://www.cda.pl/domino264/folder/31188385',
|
||||||
@ -378,9 +378,6 @@ class CDAFolderIE(InfoExtractor):
|
|||||||
'title': 'TESTY KOSMETYKÓW',
|
'title': 'TESTY KOSMETYKÓW',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 139,
|
'playlist_mincount': 139,
|
||||||
}, {
|
|
||||||
'url': 'https://www.cda.pl/FILMY-SERIALE-ANIME-KRESKOWKI-BAJKI/folder/18493422',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -1570,8 +1570,6 @@ class InfoExtractor:
|
|||||||
"""Yield all json ld objects in the html"""
|
"""Yield all json ld objects in the html"""
|
||||||
if default is not NO_DEFAULT:
|
if default is not NO_DEFAULT:
|
||||||
fatal = False
|
fatal = False
|
||||||
if not fatal and not isinstance(html, str):
|
|
||||||
return
|
|
||||||
for mobj in re.finditer(JSON_LD_RE, html):
|
for mobj in re.finditer(JSON_LD_RE, html):
|
||||||
json_ld_item = self._parse_json(
|
json_ld_item = self._parse_json(
|
||||||
mobj.group('json_ld'), video_id, fatal=fatal,
|
mobj.group('json_ld'), video_id, fatal=fatal,
|
||||||
|
@ -5,9 +5,7 @@ from ..utils import (
|
|||||||
int_or_none,
|
int_or_none,
|
||||||
try_get,
|
try_get,
|
||||||
unified_strdate,
|
unified_strdate,
|
||||||
url_or_none,
|
|
||||||
)
|
)
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class CrowdBunkerIE(InfoExtractor):
|
class CrowdBunkerIE(InfoExtractor):
|
||||||
@ -46,15 +44,16 @@ class CrowdBunkerIE(InfoExtractor):
|
|||||||
'url': sub_url,
|
'url': sub_url,
|
||||||
})
|
})
|
||||||
|
|
||||||
if mpd_url := traverse_obj(video_json, ('dashManifest', 'url', {url_or_none})):
|
mpd_url = try_get(video_json, lambda x: x['dashManifest']['url'])
|
||||||
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id, mpd_id='dash', fatal=False)
|
if mpd_url:
|
||||||
|
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id)
|
||||||
formats.extend(fmts)
|
formats.extend(fmts)
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
subtitles = self._merge_subtitles(subtitles, subs)
|
||||||
|
m3u8_url = try_get(video_json, lambda x: x['hlsManifest']['url'])
|
||||||
if m3u8_url := traverse_obj(video_json, ('hlsManifest', 'url', {url_or_none})):
|
if m3u8_url:
|
||||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, m3u8_id='hls', fatal=False)
|
fmts, subs = self._extract_m3u8_formats_and_subtitles(mpd_url, video_id)
|
||||||
formats.extend(fmts)
|
formats.extend(fmts)
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
subtitles = self._merge_subtitles(subtitles, subs)
|
||||||
|
|
||||||
thumbnails = [{
|
thumbnails = [{
|
||||||
'url': image['url'],
|
'url': image['url'],
|
||||||
|
@ -1,87 +0,0 @@
|
|||||||
import urllib.parse
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
|
||||||
from ..networking.exceptions import HTTPError
|
|
||||||
from ..utils import (
|
|
||||||
ExtractorError,
|
|
||||||
float_or_none,
|
|
||||||
url_or_none,
|
|
||||||
)
|
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class FrancaisFacileIE(InfoExtractor):
|
|
||||||
_VALID_URL = r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250305-r%C3%A9concilier-les-jeunes-avec-la-lecture-gr%C3%A2ce-aux-r%C3%A9seaux-sociaux',
|
|
||||||
'md5': '4f33674cb205744345cc835991100afa',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'WBMZ58952-FLE-FR-20250305',
|
|
||||||
'display_id': '20250305-réconcilier-les-jeunes-avec-la-lecture-grâce-aux-réseaux-sociaux',
|
|
||||||
'title': 'Réconcilier les jeunes avec la lecture grâce aux réseaux sociaux',
|
|
||||||
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/05/6b6af52a-f9ba-11ef-a1f8-005056a97652.mp3',
|
|
||||||
'ext': 'mp3',
|
|
||||||
'description': 'md5:b903c63d8585bd59e8cc4d5f80c4272d',
|
|
||||||
'duration': 103.15,
|
|
||||||
'timestamp': 1741177984,
|
|
||||||
'upload_date': '20250305',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250307-argentine-le-sac-d-un-alpiniste-retrouv%C3%A9-40-ans-apr%C3%A8s-sa-mort',
|
|
||||||
'md5': 'b8c3a63652d4ae8e8092dda5700c1cd9',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'WBMZ59102-FLE-FR-20250307',
|
|
||||||
'display_id': '20250307-argentine-le-sac-d-un-alpiniste-retrouvé-40-ans-après-sa-mort',
|
|
||||||
'title': 'Argentine: le sac d\'un alpiniste retrouvé 40 ans après sa mort',
|
|
||||||
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/07/8edf4082-fb46-11ef-8a37-005056bf762b.mp3',
|
|
||||||
'ext': 'mp3',
|
|
||||||
'description': 'md5:7fd088fbdf4a943bb68cf82462160dca',
|
|
||||||
'duration': 117.74,
|
|
||||||
'timestamp': 1741352789,
|
|
||||||
'upload_date': '20250307',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://francaisfacile.rfi.fr/fr/podcasts/un-mot-une-histoire/20250317-le-mot-de-david-foenkinos-peut-%C3%AAtre',
|
|
||||||
'md5': 'db83c2cc2589b4c24571c6b6cf14f5f1',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'WBMZ59441-FLE-FR-20250317',
|
|
||||||
'display_id': '20250317-le-mot-de-david-foenkinos-peut-être',
|
|
||||||
'title': 'Le mot de David Foenkinos: «peut-être» - Un mot, une histoire',
|
|
||||||
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/17/4ca6cbbe-0315-11f0-a85b-005056a97652.mp3',
|
|
||||||
'ext': 'mp3',
|
|
||||||
'description': 'md5:3fe35fae035803df696bfa7af2496e49',
|
|
||||||
'duration': 198.96,
|
|
||||||
'timestamp': 1742210897,
|
|
||||||
'upload_date': '20250317',
|
|
||||||
},
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
display_id = urllib.parse.unquote(self._match_id(url))
|
|
||||||
|
|
||||||
try: # yt-dlp's default user-agents are too old and blocked by the site
|
|
||||||
webpage = self._download_webpage(url, display_id, headers={
|
|
||||||
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
|
|
||||||
})
|
|
||||||
except ExtractorError as e:
|
|
||||||
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
|
|
||||||
raise
|
|
||||||
# Retry with impersonation if hardcoded UA is insufficient
|
|
||||||
webpage = self._download_webpage(url, display_id, impersonate=True)
|
|
||||||
|
|
||||||
data = self._search_json(
|
|
||||||
r'<script[^>]+\bdata-media-id=[^>]+\btype="application/json"[^>]*>',
|
|
||||||
webpage, 'audio data', display_id)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': data['mediaId'],
|
|
||||||
'display_id': display_id,
|
|
||||||
'vcodec': 'none',
|
|
||||||
'title': self._html_extract_title(webpage),
|
|
||||||
**self._search_json_ld(webpage, display_id, fatal=False),
|
|
||||||
**traverse_obj(data, {
|
|
||||||
'title': ('title', {str}),
|
|
||||||
'url': ('sources', ..., 'url', {url_or_none}, any),
|
|
||||||
'duration': ('sources', ..., 'duration', {float_or_none}, any),
|
|
||||||
}),
|
|
||||||
}
|
|
@ -2214,21 +2214,10 @@ class GenericIE(InfoExtractor):
|
|||||||
if is_live is not None:
|
if is_live is not None:
|
||||||
info['live_status'] = 'not_live' if is_live == 'false' else 'is_live'
|
info['live_status'] = 'not_live' if is_live == 'false' else 'is_live'
|
||||||
return
|
return
|
||||||
headers = m3u8_format.get('http_headers') or info.get('http_headers') or {}
|
headers = m3u8_format.get('http_headers') or info.get('http_headers')
|
||||||
display_id = info.get('id')
|
duration = self._extract_m3u8_vod_duration(
|
||||||
urlh = self._request_webpage(
|
m3u8_format['url'], info.get('id'), note='Checking m3u8 live status',
|
||||||
m3u8_format['url'], display_id, 'Checking m3u8 live status', errnote=False,
|
errnote='Failed to download m3u8 media playlist', headers=headers)
|
||||||
headers={**headers, 'Accept-Encoding': 'identity'}, fatal=False)
|
|
||||||
if urlh is False:
|
|
||||||
return
|
|
||||||
first_bytes = urlh.read(512)
|
|
||||||
if not first_bytes.startswith(b'#EXTM3U'):
|
|
||||||
return
|
|
||||||
m3u8_doc = self._webpage_read_content(
|
|
||||||
urlh, urlh.url, display_id, prefix=first_bytes, fatal=False, errnote=False)
|
|
||||||
if not m3u8_doc:
|
|
||||||
return
|
|
||||||
duration = self._parse_m3u8_vod_duration(m3u8_doc, display_id)
|
|
||||||
if not duration:
|
if not duration:
|
||||||
info['live_status'] = 'is_live'
|
info['live_status'] = 'is_live'
|
||||||
info['duration'] = info.get('duration') or duration
|
info['duration'] = info.get('duration') or duration
|
||||||
|
@ -1,78 +0,0 @@
|
|||||||
from .common import InfoExtractor
|
|
||||||
from ..utils import int_or_none, parse_iso8601, url_or_none, urljoin
|
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class IvooxIE(InfoExtractor):
|
|
||||||
_VALID_URL = (
|
|
||||||
r'https?://(?:www\.)?ivoox\.com/(?:\w{2}/)?[^/?#]+_rf_(?P<id>[0-9]+)_1\.html',
|
|
||||||
r'https?://go\.ivoox\.com/rf/(?P<id>[0-9]+)',
|
|
||||||
)
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://www.ivoox.com/dex-08x30-rostros-del-mal-los-asesinos-en-audios-mp3_rf_143594959_1.html',
|
|
||||||
'md5': '993f712de5b7d552459fc66aa3726885',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '143594959',
|
|
||||||
'ext': 'mp3',
|
|
||||||
'timestamp': 1742731200,
|
|
||||||
'channel': 'DIAS EXTRAÑOS con Santiago Camacho',
|
|
||||||
'title': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
|
|
||||||
'description': 'md5:eae8b4b9740d0216d3871390b056bb08',
|
|
||||||
'uploader': 'Santiago Camacho',
|
|
||||||
'thumbnail': 'https://static-1.ivoox.com/audios/c/d/5/2/cd52f46783fe735000c33a803dce2554_XXL.jpg',
|
|
||||||
'upload_date': '20250323',
|
|
||||||
'episode': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
|
|
||||||
'duration': 11837,
|
|
||||||
'tags': ['españa', 'asesinos en serie', 'arropiero', 'historia criminal', 'mataviejas'],
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://go.ivoox.com/rf/143594959',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.ivoox.com/en/campodelgas-28-03-2025-audios-mp3_rf_144036942_1.html',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
media_id = self._match_id(url)
|
|
||||||
webpage = self._download_webpage(url, media_id, fatal=False)
|
|
||||||
|
|
||||||
data = self._search_nuxt_data(
|
|
||||||
webpage, media_id, fatal=False, traverse=('data', 0, 'data', 'audio'))
|
|
||||||
|
|
||||||
direct_download = self._download_json(
|
|
||||||
f'https://vcore-web.ivoox.com/v1/public/audios/{media_id}/download-url', media_id, fatal=False,
|
|
||||||
note='Fetching direct download link', headers={'Referer': url})
|
|
||||||
|
|
||||||
download_paths = {
|
|
||||||
*traverse_obj(direct_download, ('data', 'downloadUrl', {str}, filter, all)),
|
|
||||||
*traverse_obj(data, (('downloadUrl', 'mediaUrl'), {str}, filter)),
|
|
||||||
}
|
|
||||||
|
|
||||||
formats = []
|
|
||||||
for path in download_paths:
|
|
||||||
formats.append({
|
|
||||||
'url': urljoin('https://ivoox.com', path),
|
|
||||||
'http_headers': {'Referer': url},
|
|
||||||
})
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': media_id,
|
|
||||||
'formats': formats,
|
|
||||||
'uploader': self._html_search_regex(r'data-prm-author="([^"]+)"', webpage, 'author', default=None),
|
|
||||||
'timestamp': parse_iso8601(
|
|
||||||
self._html_search_regex(r'data-prm-pubdate="([^"]+)"', webpage, 'timestamp', default=None)),
|
|
||||||
'channel': self._html_search_regex(r'data-prm-podname="([^"]+)"', webpage, 'channel', default=None),
|
|
||||||
'title': self._html_search_regex(r'data-prm-title="([^"]+)"', webpage, 'title', default=None),
|
|
||||||
'thumbnail': self._og_search_thumbnail(webpage, default=None),
|
|
||||||
'description': self._og_search_description(webpage, default=None),
|
|
||||||
**self._search_json_ld(webpage, media_id, default={}),
|
|
||||||
**traverse_obj(data, {
|
|
||||||
'title': ('title', {str}),
|
|
||||||
'description': ('description', {str}),
|
|
||||||
'thumbnail': ('image', {url_or_none}),
|
|
||||||
'timestamp': ('uploadDate', {parse_iso8601(delimiter=' ')}),
|
|
||||||
'duration': ('duration', {int_or_none}),
|
|
||||||
'tags': ('tags', ..., 'name', {str}),
|
|
||||||
}),
|
|
||||||
}
|
|
@ -1,5 +1,3 @@
|
|||||||
import itertools
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
determine_ext,
|
determine_ext,
|
||||||
@@ -126,43 +124,3 @@ class KikaIE(InfoExtractor):
                         'vbr': ('bitrateVideo', {int_or_none}, {lambda x: None if x == -1 else x}),
                     }),
                 }
-
-
-class KikaPlaylistIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?kika\.de/[\w-]+/(?P<id>[a-z-]+\d+)'
-
-    _TESTS = [{
-        'url': 'https://www.kika.de/logo/logo-die-welt-und-ich-562',
-        'info_dict': {
-            'id': 'logo-die-welt-und-ich-562',
-            'title': 'logo!',
-            'description': 'md5:7b9d7f65561b82fa512f2cfb553c397d',
-        },
-        'playlist_count': 100,
-    }]
-
-    def _entries(self, playlist_url, playlist_id):
-        for page in itertools.count(1):
-            data = self._download_json(playlist_url, playlist_id, note=f'Downloading page {page}')
-            for item in traverse_obj(data, ('content', lambda _, v: url_or_none(v['api']['url']))):
-                yield self.url_result(
-                    item['api']['url'], ie=KikaIE,
-                    **traverse_obj(item, {
-                        'id': ('id', {str}),
-                        'title': ('title', {str}),
-                        'duration': ('duration', {int_or_none}),
-                        'timestamp': ('date', {parse_iso8601}),
-                    }))
-
-            playlist_url = traverse_obj(data, ('links', 'next', {url_or_none}))
-            if not playlist_url:
-                break
-
-    def _real_extract(self, url):
-        playlist_id = self._match_id(url)
-        brand_data = self._download_json(
-            f'https://www.kika.de/_next-api/proxy/v1/brands/{playlist_id}', playlist_id)
-
-        return self.playlist_result(
-            self._entries(brand_data['videoSubchannel']['videosPageUrl'], playlist_id),
-            playlist_id, title=brand_data.get('title'), description=brand_data.get('description'))
@@ -82,10 +82,7 @@ class LinkedInLearningBaseIE(LinkedInBaseIE):
 
 
 class LinkedInIE(LinkedInBaseIE):
-    _VALID_URL = [
-        r'https?://(?:www\.)?linkedin\.com/posts/[^/?#]+-(?P<id>\d+)-\w{4}/?(?:[?#]|$)',
-        r'https?://(?:www\.)?linkedin\.com/feed/update/urn:li:activity:(?P<id>\d+)',
-    ]
+    _VALID_URL = r'https?://(?:www\.)?linkedin\.com/posts/[^/?#]+-(?P<id>\d+)-\w{4}/?(?:[?#]|$)'
     _TESTS = [{
         'url': 'https://www.linkedin.com/posts/mishalkhawaja_sendinblueviews-toronto-digitalmarketing-ugcPost-6850898786781339649-mM20',
         'info_dict': {
@@ -109,9 +106,6 @@ class LinkedInIE(LinkedInBaseIE):
             'like_count': int,
             'subtitles': 'mincount:1',
         },
-    }, {
-        'url': 'https://www.linkedin.com/feed/update/urn:li:activity:7016901149999955968/?utm_source=share&utm_medium=member_desktop',
-        'only_matching': True,
     }]
 
     def _real_extract(self, url):
@@ -1,9 +1,5 @@
-import json
-import random
-import time
-
 from .common import InfoExtractor
-from ..utils import int_or_none, jwt_decode_hs256, try_call, url_or_none
+from ..utils import int_or_none, url_or_none
 from ..utils.traversal import require, traverse_obj
 
 
@@ -59,81 +55,13 @@ class LocoIE(InfoExtractor):
             'upload_date': '20250226',
             'modified_date': '20250226',
         },
-    }, {
-        # Requires video authorization
-        'url': 'https://loco.com/stream/ac854641-ae0f-497c-a8ea-4195f6d8cc53',
-        'md5': '0513edf85c1e65c9521f555f665387d5',
-        'info_dict': {
-            'id': 'ac854641-ae0f-497c-a8ea-4195f6d8cc53',
-            'ext': 'mp4',
-            'title': 'DUAS CONTAS DESAFIANTE, RUSH TOP 1 NO BRASIL!',
-            'description': 'md5:aa77818edd6fe00dd4b6be75cba5f826',
-            'uploader_id': '7Y9JNAZC3Q',
-            'channel': 'ayellol',
-            'channel_follower_count': int,
-            'comment_count': int,
-            'view_count': int,
-            'concurrent_view_count': int,
-            'like_count': int,
-            'duration': 1229,
-            'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/f5aa678b-6d04-45d9-a89a-859af0a8028f.jpg',
-            'tags': ['Gameplay', 'Carry'],
-            'series': 'League of Legends',
-            'timestamp': 1741182253,
-            'upload_date': '20250305',
-            'modified_timestamp': 1741182419,
-            'modified_date': '20250305',
-        },
     }]
 
-    # From _app.js
-    _CLIENT_ID = 'TlwKp1zmF6eKFpcisn3FyR18WkhcPkZtzwPVEEC3'
-    _CLIENT_SECRET = 'Kp7tYlUN7LXvtcSpwYvIitgYcLparbtsQSe5AdyyCdiEJBP53Vt9J8eB4AsLdChIpcO2BM19RA3HsGtqDJFjWmwoonvMSG3ZQmnS8x1YIM8yl82xMXZGbE3NKiqmgBVU'
-
-    def _is_jwt_expired(self, token):
-        return jwt_decode_hs256(token)['exp'] - time.time() < 300
-
-    def _get_access_token(self, video_id):
-        access_token = try_call(lambda: self._get_cookies('https://loco.com')['access_token'].value)
-        if access_token and not self._is_jwt_expired(access_token):
-            return access_token
-        access_token = traverse_obj(self._download_json(
-            'https://api.getloconow.com/v3/user/device_profile/', video_id,
-            'Downloading access token', fatal=False, data=json.dumps({
-                'platform': 7,
-                'client_id': self._CLIENT_ID,
-                'client_secret': self._CLIENT_SECRET,
-                'model': 'Mozilla',
-                'os_name': 'Win32',
-                'os_ver': '5.0 (Windows)',
-                'app_ver': '5.0 (Windows)',
-            }).encode(), headers={
-                'Content-Type': 'application/json;charset=utf-8',
-                'DEVICE-ID': ''.join(random.choices('0123456789abcdef', k=32)) + 'live',
-                'X-APP-LANG': 'en',
-                'X-APP-LOCALE': 'en-US',
-                'X-CLIENT-ID': self._CLIENT_ID,
-                'X-CLIENT-SECRET': self._CLIENT_SECRET,
-                'X-PLATFORM': '7',
-            }), 'access_token')
-        if access_token and not self._is_jwt_expired(access_token):
-            self._set_cookie('.loco.com', 'access_token', access_token)
-            return access_token
-
     def _real_extract(self, url):
         video_type, video_id = self._match_valid_url(url).group('type', 'id')
         webpage = self._download_webpage(url, video_id)
         stream = traverse_obj(self._search_nextjs_data(webpage, video_id), (
-            'props', 'pageProps', ('liveStreamData', 'stream', 'liveStream'), {dict}, any, {require('stream info')}))
+            'props', 'pageProps', ('liveStreamData', 'stream'), {dict}, any, {require('stream info')}))
 
-        if access_token := self._get_access_token(video_id):
-            self._request_webpage(
-                'https://drm.loco.com/v1/streams/playback/', video_id,
-                'Downloading video authorization', fatal=False, headers={
-                    'authorization': access_token,
-                }, query={
-                    'stream_uid': stream['uid'],
-                })
-
         return {
             'formats': self._extract_m3u8_formats(stream['conf']['hls'], video_id),
@@ -2,11 +2,8 @@ from .common import InfoExtractor
 from ..utils import (
     clean_html,
     merge_dicts,
-    str_or_none,
     traverse_obj,
-    unified_timestamp,
     url_or_none,
-    urljoin,
 )
 
 
@@ -83,7 +80,7 @@ class LRTVODIE(LRTBaseIE):
     }]
 
     def _real_extract(self, url):
-        path, video_id = self._match_valid_url(url).group('path', 'id')
+        path, video_id = self._match_valid_url(url).groups()
         webpage = self._download_webpage(url, video_id)
 
         media_url = self._extract_js_var(webpage, 'main_url', path)
@@ -109,42 +106,3 @@ class LRTVODIE(LRTBaseIE):
         }
 
         return merge_dicts(clean_info, jw_data, json_ld_data)
-
-
-class LRTRadioIE(LRTBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?lrt\.lt/radioteka/irasas/(?P<id>\d+)/(?P<path>[^?#/]+)'
-    _TESTS = [{
-        # m3u8 download
-        'url': 'https://www.lrt.lt/radioteka/irasas/2000359728/nemarios-eiles-apie-pragarus-ir-skaistyklas-su-aiste-kiltinaviciute',
-        'info_dict': {
-            'id': '2000359728',
-            'ext': 'm4a',
-            'title': 'Nemarios eilės: apie pragarus ir skaistyklas su Aiste Kiltinavičiūte',
-            'description': 'md5:5eee9a0e86a55bf547bd67596204625d',
-            'timestamp': 1726143120,
-            'upload_date': '20240912',
-            'tags': 'count:5',
-            'thumbnail': r're:https?://.+/.+\.jpe?g',
-            'categories': ['Daiktiniai įrodymai'],
-        },
-    }, {
-        'url': 'https://www.lrt.lt/radioteka/irasas/2000304654/vakaras-su-knyga-svetlana-aleksijevic-cernobylio-malda-v-dalis?season=%2Fmediateka%2Faudio%2Fvakaras-su-knyga%2F2023',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id, path = self._match_valid_url(url).group('id', 'path')
-        media = self._download_json(
-            'https://www.lrt.lt/radioteka/api/media', video_id,
-            query={'url': f'/mediateka/irasas/{video_id}/{path}'})
-
-        return traverse_obj(media, {
-            'id': ('id', {int}, {str_or_none}),
-            'title': ('title', {str}),
-            'tags': ('tags', ..., 'name', {str}),
-            'categories': ('playlist_item', 'category', {str}, filter, all, filter),
-            'description': ('content', {clean_html}, {str}),
-            'timestamp': ('date', {lambda x: x.replace('.', '/')}, {unified_timestamp}),
-            'thumbnail': ('playlist_item', 'image', {urljoin('https://www.lrt.lt')}),
-            'formats': ('playlist_item', 'file', {lambda x: self._extract_m3u8_formats(x, video_id)}),
-        })
@@ -1,38 +1,31 @@
+import re
+
 from .common import InfoExtractor
 from ..utils import (
-    clean_html,
     determine_ext,
+    extract_attributes,
     int_or_none,
-    join_nonempty,
-    parse_count,
-    parse_duration,
-    parse_iso8601,
+    str_to_int,
     url_or_none,
+    urlencode_postdata,
 )
-from ..utils.traversal import traverse_obj
 
 
 class ManyVidsIE(InfoExtractor):
+    _WORKING = False
     _VALID_URL = r'(?i)https?://(?:www\.)?manyvids\.com/video/(?P<id>\d+)'
     _TESTS = [{
         # preview video
-        'url': 'https://www.manyvids.com/Video/530341/mv-tips-tricks',
-        'md5': '738dc723f7735ee9602f7ea352a6d058',
+        'url': 'https://www.manyvids.com/Video/133957/everthing-about-me/',
+        'md5': '03f11bb21c52dd12a05be21a5c7dcc97',
         'info_dict': {
-            'id': '530341-preview',
+            'id': '133957',
             'ext': 'mp4',
-            'title': 'MV Tips & Tricks (Preview)',
-            'description': r're:I will take you on a tour around .{1313}$',
-            'thumbnail': r're:https://cdn5\.manyvids\.com/php_uploads/video_images/DestinyDiaz/.+\.jpg',
-            'uploader': 'DestinyDiaz',
+            'title': 'everthing about me (Preview)',
+            'uploader': 'ellyxxix',
             'view_count': int,
             'like_count': int,
-            'release_timestamp': 1508419904,
-            'tags': ['AdultSchool', 'BBW', 'SFW', 'TeacherFetish'],
-            'release_date': '20171019',
-            'duration': 3167.0,
         },
-        'expected_warnings': ['Only extracting preview'],
     }, {
         # full video
         'url': 'https://www.manyvids.com/Video/935718/MY-FACE-REVEAL/',
@@ -41,68 +34,129 @@ class ManyVidsIE(InfoExtractor):
             'id': '935718',
             'ext': 'mp4',
             'title': 'MY FACE REVEAL',
-            'description': r're:Today is the day!! I am finally taking off my mask .{445}$',
-            'thumbnail': r're:https://ods\.manyvids\.com/1001061960/3aa5397f2a723ec4597e344df66ab845/screenshots/.+\.jpg',
+            'description': 'md5:ec5901d41808b3746fed90face161612',
             'uploader': 'Sarah Calanthe',
             'view_count': int,
             'like_count': int,
-            'release_date': '20181110',
-            'tags': ['EyeContact', 'Interviews', 'MaskFetish', 'MouthFetish', 'Redhead'],
-            'release_timestamp': 1541851200,
-            'duration': 224.0,
         },
     }]
-    _API_BASE = 'https://www.manyvids.com/bff/store/video'
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        video_data = self._download_json(f'{self._API_BASE}/{video_id}/private', video_id)['data']
-        formats, preview_only = [], True
 
-        for format_id, path in [
-            ('preview', ['teaser', 'filepath']),
-            ('transcoded', ['transcodedFilepath']),
-            ('filepath', ['filepath']),
-        ]:
-            format_url = traverse_obj(video_data, (*path, {url_or_none}))
-            if not format_url:
+        real_url = f'https://www.manyvids.com/video/{video_id}/gtm.js'
+        try:
+            webpage = self._download_webpage(real_url, video_id)
+        except Exception:
+            # probably useless fallback
+            webpage = self._download_webpage(url, video_id)
+
+        info = self._search_regex(
+            r'''(<div\b[^>]*\bid\s*=\s*(['"])pageMetaDetails\2[^>]*>)''',
+            webpage, 'meta details', default='')
+        info = extract_attributes(info)
+
+        player = self._search_regex(
+            r'''(<div\b[^>]*\bid\s*=\s*(['"])rmpPlayerStream\2[^>]*>)''',
+            webpage, 'player details', default='')
+        player = extract_attributes(player)
+
+        video_urls_and_ids = (
+            (info.get('data-meta-video'), 'video'),
+            (player.get('data-video-transcoded'), 'transcoded'),
+            (player.get('data-video-filepath'), 'filepath'),
+            (self._og_search_video_url(webpage, secure=False, default=None), 'og_video'),
+        )
+
+        def txt_or_none(s, default=None):
+            return (s.strip() or default) if isinstance(s, str) else default
+
+        uploader = txt_or_none(info.get('data-meta-author'))
+
+        def mung_title(s):
+            if uploader:
+                s = re.sub(rf'^\s*{re.escape(uploader)}\s+[|-]', '', s)
+            return txt_or_none(s)
+
+        title = (
+            mung_title(info.get('data-meta-title'))
+            or self._html_search_regex(
+                (r'<span[^>]+class=["\']item-title[^>]+>([^<]+)',
+                 r'<h2[^>]+class=["\']h2 m-0["\'][^>]*>([^<]+)'),
+                webpage, 'title', default=None)
+            or self._html_search_meta(
+                'twitter:title', webpage, 'title', fatal=True))
+
+        title = re.sub(r'\s*[|-]\s+ManyVids\s*$', '', title) or title
+
+        if any(p in webpage for p in ('preview_videos', '_preview.mp4')):
+            title += ' (Preview)'
+
+        mv_token = self._search_regex(
+            r'data-mvtoken=(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
+            'mv token', default=None, group='value')
+
+        if mv_token:
+            # Sets some cookies
+            self._download_webpage(
+                'https://www.manyvids.com/includes/ajax_repository/you_had_me_at_hello.php',
+                video_id, note='Setting format cookies', fatal=False,
+                data=urlencode_postdata({
+                    'mvtoken': mv_token,
+                    'vid': video_id,
+                }), headers={
+                    'Referer': url,
+                    'X-Requested-With': 'XMLHttpRequest',
+                })
+
+        formats = []
+        for v_url, fmt in video_urls_and_ids:
+            v_url = url_or_none(v_url)
+            if not v_url:
                 continue
-            if determine_ext(format_url) == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(format_url, video_id, 'mp4', m3u8_id=format_id))
+            if determine_ext(v_url) == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    v_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls'))
             else:
                 formats.append({
-                    'url': format_url,
-                    'format_id': format_id,
-                    'preference': -10 if format_id == 'preview' else None,
-                    'quality': 10 if format_id == 'filepath' else None,
-                    'height': int_or_none(
-                        self._search_regex(r'_(\d{2,3}[02468])_', format_url, 'height', default=None)),
+                    'url': v_url,
+                    'format_id': fmt,
                 })
-            if format_id != 'preview':
-                preview_only = False
 
-        metadata = traverse_obj(
-            self._download_json(f'{self._API_BASE}/{video_id}', video_id, fatal=False), 'data')
-        title = traverse_obj(metadata, ('title', {clean_html}))
+        self._remove_duplicate_formats(formats)
 
-        if preview_only:
-            title = join_nonempty(title, '(Preview)', delim=' ')
-            video_id += '-preview'
-            self.report_warning(
-                f'Only extracting preview. Video may be paid or subscription only. {self._login_hint()}')
+        for f in formats:
+            if f.get('height') is None:
+                f['height'] = int_or_none(
+                    self._search_regex(r'_(\d{2,3}[02468])_', f['url'], 'video height', default=None))
+            if '/preview/' in f['url']:
+                f['format_id'] = '_'.join(filter(None, (f.get('format_id'), 'preview')))
+                f['preference'] = -10
+            if 'transcoded' in f['format_id']:
+                f['preference'] = f.get('preference', -1) - 1
+
+        def get_likes():
+            likes = self._search_regex(
+                rf'''(<a\b[^>]*\bdata-id\s*=\s*(['"]){video_id}\2[^>]*>)''',
+                webpage, 'likes', default='')
+            likes = extract_attributes(likes)
+            return int_or_none(likes.get('data-likes'))
+
+        def get_views():
+            return str_to_int(self._html_search_regex(
+                r'''(?s)<span\b[^>]*\bclass\s*=["']views-wrapper\b[^>]+>.+?<span\b[^>]+>\s*(\d[\d,.]*)\s*</span>''',
+                webpage, 'view count', default=None))
 
         return {
             'id': video_id,
             'title': title,
             'formats': formats,
-            **traverse_obj(metadata, {
-                'description': ('description', {clean_html}),
-                'uploader': ('model', 'displayName', {clean_html}),
-                'thumbnail': (('screenshot', 'thumbnail'), {url_or_none}, any),
-                'view_count': ('views', {parse_count}),
-                'like_count': ('likes', {parse_count}),
-                'release_timestamp': ('launchDate', {parse_iso8601}),
-                'duration': ('videoDuration', {parse_duration}),
-                'tags': ('tagList', ..., 'label', {str}, filter, all, filter),
-            }),
+            'description': txt_or_none(info.get('data-meta-description')),
+            'uploader': txt_or_none(info.get('data-meta-author')),
+            'thumbnail': (
+                url_or_none(info.get('data-meta-image'))
+                or url_or_none(player.get('data-video-screenshot'))),
+            'view_count': get_views(),
+            'like_count': get_likes(),
         }
@@ -4,7 +4,6 @@ from .common import InfoExtractor
 from ..utils import (
     int_or_none,
     parse_iso8601,
-    parse_resolution,
     traverse_obj,
     unified_timestamp,
     url_basename,
@@ -84,8 +83,8 @@ class MicrosoftMediusBaseIE(InfoExtractor):
             subtitles.setdefault(sub.pop('tag', 'und'), []).append(sub)
         return subtitles
 
-    def _extract_ism(self, ism_url, video_id, fatal=True):
-        formats = self._extract_ism_formats(ism_url, video_id, fatal=fatal)
+    def _extract_ism(self, ism_url, video_id):
+        formats = self._extract_ism_formats(ism_url, video_id)
         for fmt in formats:
             if fmt['language'] != 'eng' and 'English' not in fmt['format_id']:
                 fmt['language_preference'] = -10
@@ -219,21 +218,9 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
             'description': 'md5:7bbbfb593d21c2cf2babc3715ade6b88',
             'timestamp': 1676339547,
             'upload_date': '20230214',
-            'thumbnail': r're:https://learn\.microsoft\.com/video/media/.+\.png',
+            'thumbnail': r're:https://learn\.microsoft\.com/video/media/.*\.png',
             'subtitles': 'count:14',
         },
-    }, {
-        'url': 'https://learn.microsoft.com/en-gb/shows/on-demand-instructor-led-training-series/az-900-module-1',
-        'info_dict': {
-            'id': '4fe10f7c-d83c-463b-ac0e-c30a8195e01b',
-            'ext': 'mp4',
-            'title': 'AZ-900 Cloud fundamentals (1 of 6)',
-            'description': 'md5:3c2212ce865e9142f402c766441bd5c9',
-            'thumbnail': r're:https://.+/.+\.jpg',
-            'timestamp': 1706605184,
-            'upload_date': '20240130',
-        },
-        'params': {'format': 'bv[protocol=https]'},
     }]
 
     def _real_extract(self, url):
@@ -243,32 +230,9 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
         entry_id = self._html_search_meta('entryId', webpage, 'entryId', fatal=True)
         video_info = self._download_json(
             f'https://learn.microsoft.com/api/video/public/v1/entries/{entry_id}', video_id)
-
-        formats = []
-        if ism_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoUrl', {url_or_none})):
-            formats.extend(self._extract_ism(ism_url, video_id, fatal=False))
-        if hls_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoHLSUrl', {url_or_none})):
-            formats.extend(self._extract_m3u8_formats(hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
-        if mpd_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoDashUrl', {url_or_none})):
-            formats.extend(self._extract_mpd_formats(mpd_url, video_id, mpd_id='dash', fatal=False))
-        for key in ('low', 'medium', 'high'):
-            if video_url := traverse_obj(video_info, ('publicVideo', f'{key}QualityVideoUrl', {url_or_none})):
-                formats.append({
-                    'url': video_url,
-                    'format_id': f'video-http-{key}',
-                    'acodec': 'none',
-                    **parse_resolution(video_url),
-                })
-        if audio_url := traverse_obj(video_info, ('publicVideo', 'audioUrl', {url_or_none})):
-            formats.append({
-                'url': audio_url,
-                'format_id': 'audio-http',
-                'vcodec': 'none',
-            })
-
         return {
             'id': entry_id,
-            'formats': formats,
+            'formats': self._extract_ism(video_info['publicVideo']['adaptiveVideoUrl'], video_id),
             'subtitles': self._sub_to_dict(traverse_obj(video_info, (
                 'publicVideo', 'captions', lambda _, v: url_or_none(v['url']), {
                     'tag': ('language', {str}),
@@ -10,9 +10,7 @@ from ..utils import (
     parse_iso8601,
     strip_or_none,
     try_get,
-    url_or_none,
 )
-from ..utils.traversal import traverse_obj
 
 
 class MixcloudBaseIE(InfoExtractor):
@@ -39,7 +37,7 @@ class MixcloudIE(MixcloudBaseIE):
             'ext': 'm4a',
             'title': 'Cryptkeeper',
             'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
-            'uploader': 'dholbach',
+            'uploader': 'Daniel Holbach',
             'uploader_id': 'dholbach',
             'thumbnail': r're:https?://.*\.jpg',
             'view_count': int,
@@ -48,11 +46,10 @@ class MixcloudIE(MixcloudBaseIE):
             'uploader_url': 'https://www.mixcloud.com/dholbach/',
             'artist': 'Submorphics & Chino , Telekinesis, Porter Robinson, Enei, Breakage ft Jess Mills',
             'duration': 3723,
-            'tags': ['liquid drum and bass', 'drum and bass'],
+            'tags': [],
             'comment_count': int,
             'repost_count': int,
             'like_count': int,
-            'artists': list,
         },
         'params': {'skip_download': 'm3u8'},
     }, {
@@ -70,7 +67,7 @@ class MixcloudIE(MixcloudBaseIE):
             'upload_date': '20150203',
             'uploader_url': 'https://www.mixcloud.com/gillespeterson/',
             'duration': 2992,
-            'tags': ['jazz', 'soul', 'world music', 'funk'],
+            'tags': [],
             'comment_count': int,
             'repost_count': int,
             'like_count': int,
@@ -152,6 +149,8 @@ class MixcloudIE(MixcloudBaseIE):
         elif reason:
             raise ExtractorError('Track is restricted', expected=True)
 
+        title = cloudcast['name']
+
         stream_info = cloudcast['streamInfo']
         formats = []
 
@@ -183,39 +182,47 @@ class MixcloudIE(MixcloudBaseIE):
             self.raise_login_required(metadata_available=True)
 
         comments = []
-        for node in traverse_obj(cloudcast, ('comments', 'edges', ..., 'node', {dict})):
+        for edge in (try_get(cloudcast, lambda x: x['comments']['edges']) or []):
+            node = edge.get('node') or {}
             text = strip_or_none(node.get('comment'))
             if not text:
                 continue
+            user = node.get('user') or {}
             comments.append({
+                'author': user.get('displayName'),
+                'author_id': user.get('username'),
                 'text': text,
-                **traverse_obj(node, {
-                    'author': ('user', 'displayName', {str}),
-                    'author_id': ('user', 'username', {str}),
-                    'timestamp': ('created', {parse_iso8601}),
-                }),
+                'timestamp': parse_iso8601(node.get('created')),
             })
 
+        tags = []
+        for t in cloudcast.get('tags'):
+            tag = try_get(t, lambda x: x['tag']['name'], str)
+            if not tag:
+                tags.append(tag)
+
+        get_count = lambda x: int_or_none(try_get(cloudcast, lambda y: y[x]['totalCount']))
+
+        owner = cloudcast.get('owner') or {}
+
         return {
             'id': track_id,
+            'title': title,
             'formats': formats,
+            'description': cloudcast.get('description'),
+            'thumbnail': try_get(cloudcast, lambda x: x['picture']['url'], str),
+            'uploader': owner.get('displayName'),
+            'timestamp': parse_iso8601(cloudcast.get('publishDate')),
+            'uploader_id': owner.get('username'),
+            'uploader_url': owner.get('url'),
+            'duration': int_or_none(cloudcast.get('audioLength')),
+            'view_count': int_or_none(cloudcast.get('plays')),
+            'like_count': get_count('favorites'),
+            'repost_count': get_count('reposts'),
+            'comment_count': get_count('comments'),
             'comments': comments,
-            **traverse_obj(cloudcast, {
-                'title': ('name', {str}),
-                'description': ('description', {str}),
-                'thumbnail': ('picture', 'url', {url_or_none}),
-                'timestamp': ('publishDate', {parse_iso8601}),
-                'duration': ('audioLength', {int_or_none}),
-                'uploader': ('owner', 'displayName', {str}),
-                'uploader_id': ('owner', 'username', {str}),
-                'uploader_url': ('owner', 'url', {url_or_none}),
-                'view_count': ('plays', {int_or_none}),
-                'like_count': ('favorites', 'totalCount', {int_or_none}),
-                'repost_count': ('reposts', 'totalCount', {int_or_none}),
-                'comment_count': ('comments', 'totalCount', {int_or_none}),
-                'tags': ('tags', ..., 'tag', 'name', {str}, filter, all, filter),
-                'artists': ('featuringArtistList', ..., {str}, filter, all, filter),
-            }),
+            'tags': tags,
+            'artist': ', '.join(cloudcast.get('featuringArtistList') or []) or None,
         }
 
 
@@ -288,7 +295,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
         'url': 'http://www.mixcloud.com/dholbach/',
         'info_dict': {
             'id': 'dholbach_uploads',
-            'title': 'dholbach (uploads)',
+            'title': 'Daniel Holbach (uploads)',
             'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
         },
         'playlist_mincount': 36,
@@ -296,7 +303,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
         'url': 'http://www.mixcloud.com/dholbach/uploads/',
         'info_dict': {
             'id': 'dholbach_uploads',
-            'title': 'dholbach (uploads)',
+            'title': 'Daniel Holbach (uploads)',
             'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
         },
         'playlist_mincount': 36,
@@ -304,7 +311,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
         'url': 'http://www.mixcloud.com/dholbach/favorites/',
         'info_dict': {
             'id': 'dholbach_favorites',
-            'title': 'dholbach (favorites)',
+            'title': 'Daniel Holbach (favorites)',
             'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
         },
         # 'params': {
@ -330,7 +337,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
|
|||||||
'title': 'First Ear (stream)',
|
'title': 'First Ear (stream)',
|
||||||
'description': 'we maraud for ears',
|
'description': 'we maraud for ears',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 267,
|
'playlist_mincount': 269,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
_TITLE_KEY = 'displayName'
|
_TITLE_KEY = 'displayName'
|
||||||
@ -354,7 +361,7 @@ class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
|
|||||||
'id': 'maxvibes_jazzcat-on-ness-radio',
|
'id': 'maxvibes_jazzcat-on-ness-radio',
|
||||||
'title': 'Ness Radio sessions',
|
'title': 'Ness Radio sessions',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 58,
|
'playlist_mincount': 59,
|
||||||
}]
|
}]
|
||||||
_TITLE_KEY = 'name'
|
_TITLE_KEY = 'name'
|
||||||
_DESCRIPTION_KEY = 'description'
|
_DESCRIPTION_KEY = 'description'
|
||||||
|
@ -449,7 +449,9 @@ mutation initPlaybackSession(
|
|||||||
|
|
||||||
if not (m3u8_url and token):
|
if not (m3u8_url and token):
|
||||||
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
|
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
|
||||||
if errors: # Only warn when 'blacked out' or 'not entitled'; radio formats may be available
|
if 'not entitled' in errors:
|
||||||
|
raise ExtractorError(errors, expected=True)
|
||||||
|
elif errors: # Only warn when 'blacked out' since radio formats are available
|
||||||
self.report_warning(f'API returned errors for {format_id}: {errors}')
|
self.report_warning(f'API returned errors for {format_id}: {errors}')
|
||||||
else:
|
else:
|
||||||
self.report_warning(f'No formats available for {format_id} broadcast; skipping')
|
self.report_warning(f'No formats available for {format_id} broadcast; skipping')
|
||||||
|
@ -27,7 +27,6 @@ from ..utils import (
|
|||||||
traverse_obj,
|
traverse_obj,
|
||||||
try_get,
|
try_get,
|
||||||
unescapeHTML,
|
unescapeHTML,
|
||||||
unified_timestamp,
|
|
||||||
update_url_query,
|
update_url_query,
|
||||||
url_basename,
|
url_basename,
|
||||||
url_or_none,
|
url_or_none,
|
||||||
@ -986,7 +985,6 @@ class NiconicoLiveIE(InfoExtractor):
|
|||||||
'quality': 'abr',
|
'quality': 'abr',
|
||||||
'protocol': 'hls+fmp4',
|
'protocol': 'hls+fmp4',
|
||||||
'latency': latency,
|
'latency': latency,
|
||||||
'accessRightMethod': 'single_cookie',
|
|
||||||
'chasePlay': False,
|
'chasePlay': False,
|
||||||
},
|
},
|
||||||
'room': {
|
'room': {
|
||||||
@ -1007,7 +1005,6 @@ class NiconicoLiveIE(InfoExtractor):
|
|||||||
if data.get('type') == 'stream':
|
if data.get('type') == 'stream':
|
||||||
m3u8_url = data['data']['uri']
|
m3u8_url = data['data']['uri']
|
||||||
qualities = data['data']['availableQualities']
|
qualities = data['data']['availableQualities']
|
||||||
cookies = data['data']['cookies']
|
|
||||||
break
|
break
|
||||||
elif data.get('type') == 'disconnect':
|
elif data.get('type') == 'disconnect':
|
||||||
self.write_debug(recv)
|
self.write_debug(recv)
|
||||||
@ -1046,11 +1043,6 @@ class NiconicoLiveIE(InfoExtractor):
|
|||||||
**res,
|
**res,
|
||||||
})
|
})
|
||||||
|
|
||||||
for cookie in cookies:
|
|
||||||
self._set_cookie(
|
|
||||||
cookie['domain'], cookie['name'], cookie['value'],
|
|
||||||
expire_time=unified_timestamp(cookie['expires']), path=cookie['path'], secure=cookie['secure'])
|
|
||||||
|
|
||||||
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
|
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
|
||||||
for fmt, q in zip(formats, reversed(qualities[1:])):
|
for fmt, q in zip(formats, reversed(qualities[1:])):
|
||||||
fmt.update({
|
fmt.update({
|
||||||
|
@ -11,15 +11,12 @@ class On24IE(InfoExtractor):
|
|||||||
IE_NAME = 'on24'
|
IE_NAME = 'on24'
|
||||||
IE_DESC = 'ON24'
|
IE_DESC = 'ON24'
|
||||||
|
|
||||||
_ID_RE = r'(?P<id>\d{7})'
|
_VALID_URL = r'''(?x)
|
||||||
_KEY_RE = r'(?P<key>[0-9A-F]{32})'
|
https?://event\.on24\.com/(?:
|
||||||
_URL_BASE_RE = r'https?://event\.on24\.com'
|
wcc/r/(?P<id_1>\d{7})/(?P<key_1>[0-9A-F]{32})|
|
||||||
_URL_QUERY_RE = rf'(?:[^#]*&)?eventid={_ID_RE}&(?:[^#]+&)?key={_KEY_RE}'
|
eventRegistration/(?:console/EventConsoleApollo|EventLobbyServlet\?target=lobby30)
|
||||||
_VALID_URL = [
|
\.jsp\?(?:[^/#?]*&)?eventid=(?P<id_2>\d{7})[^/#?]*&key=(?P<key_2>[0-9A-F]{32})
|
||||||
rf'{_URL_BASE_RE}/wcc/r/{_ID_RE}/{_KEY_RE}',
|
)'''
|
||||||
rf'{_URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{_URL_QUERY_RE}',
|
|
||||||
rf'{_URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{_URL_QUERY_RE}',
|
|
||||||
]
|
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false',
|
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false',
|
||||||
@ -37,16 +34,12 @@ class On24IE(InfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch',
|
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
|
||||||
'url': 'https://event.on24.com/eventRegistration/EventLobbyServlet?target=reg20.jsp&eventid=3543176&key=BC0F6B968B67C34B50D461D40FDB3E18&groupId=3143628',
|
|
||||||
'only_matching': True,
|
|
||||||
}, {
|
|
||||||
'url': 'https://event.on24.com/eventRegistration/console/apollox/mainEvent?&eventid=4843671&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=4EAC9B5C564CC98FF29E619B06A2F743&newConsole=true&nxChe=true&newTabCon=true&consoleEarEventConsole=false&consoleEarCloudApi=false&text_language_id=en&playerwidth=748&playerheight=526&referrer=https%3A%2F%2Fevent.on24.com%2Finterface%2Fregistration%2Fautoreg%2Findex.html%3Fsessionid%3D1%26eventid%3D4843671%26key%3D4EAC9B5C564CC98FF29E619B06A2F743%26email%3D000a3e42-7952-4dd6-8f8a-34c38ea3cf02%2540platform%26firstname%3Ds%26lastname%3Ds%26deletecookie%3Dtrue%26event_email%3DN%26marketing_email%3DN%26std1%3D0642572014177%26std2%3D0642572014179%26std3%3D550165f7-a44e-4725-9fe6-716f89908c2b%26std4%3D0&eventuserid=745776448&contenttype=A&mediametricsessionid=640613707&mediametricid=6810717&usercd=745776448&mode=launch',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
event_id, event_key = self._match_valid_url(url).group('id', 'key')
|
mobj = self._match_valid_url(url)
|
||||||
|
event_id = mobj.group('id_1') or mobj.group('id_2')
|
||||||
|
event_key = mobj.group('key_1') or mobj.group('key_2')
|
||||||
|
|
||||||
event_data = self._download_json(
|
event_data = self._download_json(
|
||||||
'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet',
|
'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet',
|
||||||
|
@ -14,9 +14,8 @@ from ..utils import (
|
|||||||
int_or_none,
|
int_or_none,
|
||||||
parse_qs,
|
parse_qs,
|
||||||
srt_subtitles_timecode,
|
srt_subtitles_timecode,
|
||||||
url_or_none,
|
traverse_obj,
|
||||||
)
|
)
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class PanoptoBaseIE(InfoExtractor):
|
class PanoptoBaseIE(InfoExtractor):
|
||||||
@ -346,16 +345,21 @@ class PanoptoIE(PanoptoBaseIE):
|
|||||||
subtitles = {}
|
subtitles = {}
|
||||||
for stream in streams or []:
|
for stream in streams or []:
|
||||||
stream_formats = []
|
stream_formats = []
|
||||||
for stream_url in set(traverse_obj(stream, (('StreamHttpUrl', 'StreamUrl'), {url_or_none}))):
|
http_stream_url = stream.get('StreamHttpUrl')
|
||||||
|
stream_url = stream.get('StreamUrl')
|
||||||
|
|
||||||
|
if http_stream_url:
|
||||||
|
stream_formats.append({'url': http_stream_url})
|
||||||
|
|
||||||
|
if stream_url:
|
||||||
media_type = stream.get('ViewerMediaFileTypeName')
|
media_type = stream.get('ViewerMediaFileTypeName')
|
||||||
if media_type in ('hls', ):
|
if media_type in ('hls', ):
|
||||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(stream_url, video_id, m3u8_id='hls', fatal=False)
|
m3u8_formats, stream_subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, video_id)
|
||||||
stream_formats.extend(fmts)
|
stream_formats.extend(m3u8_formats)
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
subtitles = self._merge_subtitles(subtitles, stream_subtitles)
|
||||||
else:
|
else:
|
||||||
stream_formats.append({
|
stream_formats.append({
|
||||||
'url': stream_url,
|
'url': stream_url,
|
||||||
'ext': media_type,
|
|
||||||
})
|
})
|
||||||
for fmt in stream_formats:
|
for fmt in stream_formats:
|
||||||
fmt.update({
|
fmt.update({
|
||||||
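The Panopto hunk above differs mainly in how the two URL fields of a stream (`StreamHttpUrl`, `StreamUrl`) are gathered: one side handles each field separately, the other collects both, validates them, and drops duplicates. A minimal standalone sketch of that collect-and-deduplicate idea follows. It assumes plain dictionaries as input and does not use yt-dlp's `traverse_obj` or `url_or_none`; `collect_stream_urls` is a hypothetical name, and unlike the `set(...)` in the diff it keeps the key order.

```python
from urllib.parse import urlparse


def collect_stream_urls(stream, keys=('StreamHttpUrl', 'StreamUrl')):
    """Return the distinct http(s) URLs found under `keys`, in key order."""
    urls = []
    for key in keys:
        value = stream.get(key)
        if not isinstance(value, str):
            continue  # missing or non-string field
        parsed = urlparse(value)
        # Rough validity check (an assumption of this sketch, not yt-dlp's url_or_none)
        if parsed.scheme in ('http', 'https') and parsed.netloc and value not in urls:
            urls.append(value)
    return urls


# Both fields carry the same URL, so only one entry survives
same = collect_stream_urls({
    'StreamHttpUrl': 'https://a.example/v.mp4',
    'StreamUrl': 'https://a.example/v.mp4',
})
```

Deduplicating before format extraction avoids emitting the same format twice when both fields point at the same resource.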
|
@ -1,101 +0,0 @@
|
|||||||
from .common import InfoExtractor
|
|
||||||
from ..utils import UserNotLive, int_or_none, parse_iso8601, url_or_none, urljoin
|
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class PartiBaseIE(InfoExtractor):
|
|
||||||
def _call_api(self, path, video_id, note=None):
|
|
||||||
return self._download_json(
|
|
||||||
f'https://api-backend.parti.com/parti_v2/profile/{path}', video_id, note)
|
|
||||||
|
|
||||||
|
|
||||||
class PartiVideoIE(PartiBaseIE):
|
|
||||||
IE_NAME = 'parti:video'
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?parti\.com/video/(?P<id>\d+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://parti.com/video/66284',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '66284',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'NOW LIVE ',
|
|
||||||
'upload_date': '20250327',
|
|
||||||
'categories': ['Gaming'],
|
|
||||||
'thumbnail': 'https://assets.parti.com/351424_eb9e5250-2821-484a-9c5f-ca99aa666c87.png',
|
|
||||||
'channel': 'ItZTMGG',
|
|
||||||
'timestamp': 1743044379,
|
|
||||||
},
|
|
||||||
'params': {'skip_download': 'm3u8'},
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
video_id = self._match_id(url)
|
|
||||||
data = self._call_api(f'get_livestream_channel_info/recent/{video_id}', video_id)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': video_id,
|
|
||||||
'formats': self._extract_m3u8_formats(
|
|
||||||
urljoin('https://watch.parti.com', data['livestream_recording']), video_id, 'mp4'),
|
|
||||||
**traverse_obj(data, {
|
|
||||||
'title': ('event_title', {str}),
|
|
||||||
'channel': ('user_name', {str}),
|
|
||||||
'thumbnail': ('event_file', {url_or_none}),
|
|
||||||
'categories': ('category_name', {str}, filter, all),
|
|
||||||
'timestamp': ('event_start_ts', {int_or_none}),
|
|
||||||
}),
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class PartiLivestreamIE(PartiBaseIE):
|
|
||||||
IE_NAME = 'parti:livestream'
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?parti\.com/creator/(?P<service>[\w]+)/(?P<id>[\w/-]+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://parti.com/creator/parti/Capt_Robs_Adventures',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'Capt_Robs_Adventures',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': r"re:I'm Live on Parti \d{4}-\d{2}-\d{2} \d{2}:\d{2}",
|
|
||||||
'view_count': int,
|
|
||||||
'thumbnail': r're:https://assets\.parti\.com/.+\.png',
|
|
||||||
'timestamp': 1743879776,
|
|
||||||
'upload_date': '20250405',
|
|
||||||
'live_status': 'is_live',
|
|
||||||
},
|
|
||||||
'params': {'skip_download': 'm3u8'},
|
|
||||||
}, {
|
|
||||||
'url': 'https://parti.com/creator/discord/sazboxgaming/0',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
service, creator_slug = self._match_valid_url(url).group('service', 'id')
|
|
||||||
|
|
||||||
encoded_creator_slug = creator_slug.replace('/', '%23')
|
|
||||||
creator_id = self._call_api(
|
|
||||||
f'get_user_by_social_media/{service}/{encoded_creator_slug}',
|
|
||||||
creator_slug, note='Fetching user ID')
|
|
||||||
|
|
||||||
data = self._call_api(
|
|
||||||
f'get_livestream_channel_info/{creator_id}', creator_id,
|
|
||||||
note='Fetching user profile feed')['channel_info']
|
|
||||||
|
|
||||||
if not traverse_obj(data, ('channel', 'is_live', {bool})):
|
|
||||||
raise UserNotLive(video_id=creator_id)
|
|
||||||
|
|
||||||
channel_info = data['channel']
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': creator_slug,
|
|
||||||
'formats': self._extract_m3u8_formats(
|
|
||||||
channel_info['playback_url'], creator_slug, live=True, query={
|
|
||||||
'token': channel_info['playback_auth_token'],
|
|
||||||
'player_version': '1.17.0',
|
|
||||||
}),
|
|
||||||
'is_live': True,
|
|
||||||
**traverse_obj(data, {
|
|
||||||
'title': ('livestream_event_info', 'event_name', {str}),
|
|
||||||
'description': ('livestream_event_info', 'event_description', {str}),
|
|
||||||
'thumbnail': ('livestream_event_info', 'livestream_preview_file', {url_or_none}),
|
|
||||||
'timestamp': ('stream', 'start_time', {parse_iso8601}),
|
|
||||||
'view_count': ('stream', 'viewer_count', {int_or_none}),
|
|
||||||
}),
|
|
||||||
}
|
|
@ -1,43 +0,0 @@
|
|||||||
from .common import InfoExtractor
|
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class RoyaLiveIE(InfoExtractor):
|
|
||||||
_VALID_URL = r'https?://roya\.tv/live-stream/(?P<id>\d+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://roya.tv/live-stream/1',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '1',
|
|
||||||
'title': r're:Roya TV \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'live_status': 'is_live',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://roya.tv/live-stream/21',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '21',
|
|
||||||
'title': r're:Roya News \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'live_status': 'is_live',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://roya.tv/live-stream/10000',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
media_id = self._match_id(url)
|
|
||||||
|
|
||||||
stream_url = self._download_json(
|
|
||||||
f'https://ticket.roya-tv.com/api/v5/fastchannel/{media_id}', media_id)['data']['secured_url']
|
|
||||||
|
|
||||||
title = traverse_obj(
|
|
||||||
self._download_json('https://backend.roya.tv/api/v01/channels/schedule-pagination', media_id, fatal=False),
|
|
||||||
('data', 0, 'channel', lambda _, v: str(v['id']) == media_id, 'title', {str}, any))
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': media_id,
|
|
||||||
'formats': self._extract_m3u8_formats(stream_url, media_id, 'mp4', m3u8_id='hls', live=True),
|
|
||||||
'title': title,
|
|
||||||
'is_live': True,
|
|
||||||
}
|
|
@ -1,142 +1,35 @@
|
|||||||
import base64
|
import base64
|
||||||
import io
|
import io
|
||||||
import struct
|
import struct
|
||||||
import urllib.parse
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
ExtractorError,
|
ExtractorError,
|
||||||
clean_html,
|
|
||||||
determine_ext,
|
determine_ext,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
make_archive_id,
|
|
||||||
parse_iso8601,
|
|
||||||
qualities,
|
qualities,
|
||||||
url_or_none,
|
remove_end,
|
||||||
|
remove_start,
|
||||||
|
try_get,
|
||||||
)
|
)
|
||||||
from ..utils.traversal import subs_list_to_dict, traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class RTVEBaseIE(InfoExtractor):
|
class RTVEALaCartaIE(InfoExtractor):
|
||||||
# Reimplementation of https://js2.rtve.es/pages/app-player/3.5.1/js/pf_video.js
|
|
||||||
@staticmethod
|
|
||||||
def _decrypt_url(png):
|
|
||||||
encrypted_data = io.BytesIO(base64.b64decode(png)[8:])
|
|
||||||
while True:
|
|
||||||
length_data = encrypted_data.read(4)
|
|
||||||
length = struct.unpack('!I', length_data)[0]
|
|
||||||
chunk_type = encrypted_data.read(4)
|
|
||||||
if chunk_type == b'IEND':
|
|
||||||
break
|
|
||||||
data = encrypted_data.read(length)
|
|
||||||
if chunk_type == b'tEXt':
|
|
||||||
data = bytes(filter(None, data))
|
|
||||||
alphabet_data, _, url_data = data.partition(b'#')
|
|
||||||
quality_str, _, url_data = url_data.rpartition(b'%%')
|
|
||||||
quality_str = quality_str.decode() or ''
|
|
||||||
alphabet = RTVEBaseIE._get_alphabet(alphabet_data)
|
|
||||||
url = RTVEBaseIE._get_url(alphabet, url_data)
|
|
||||||
yield quality_str, url
|
|
||||||
encrypted_data.read(4) # CRC
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _get_url(alphabet, url_data):
|
|
||||||
url = ''
|
|
||||||
f = 0
|
|
||||||
e = 3
|
|
||||||
b = 1
|
|
||||||
for char in url_data.decode('iso-8859-1'):
|
|
||||||
if f == 0:
|
|
||||||
l = int(char) * 10
|
|
||||||
f = 1
|
|
||||||
else:
|
|
||||||
if e == 0:
|
|
||||||
l += int(char)
|
|
||||||
url += alphabet[l]
|
|
||||||
e = (b + 3) % 4
|
|
||||||
f = 0
|
|
||||||
b += 1
|
|
||||||
else:
|
|
||||||
e -= 1
|
|
||||||
return url
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _get_alphabet(alphabet_data):
|
|
||||||
alphabet = []
|
|
||||||
e = 0
|
|
||||||
d = 0
|
|
||||||
for char in alphabet_data.decode('iso-8859-1'):
|
|
||||||
if d == 0:
|
|
||||||
alphabet.append(char)
|
|
||||||
d = e = (e + 1) % 4
|
|
||||||
else:
|
|
||||||
d -= 1
|
|
||||||
return alphabet
|
|
||||||
|
|
||||||
def _extract_png_formats_and_subtitles(self, video_id, media_type='videos'):
|
|
||||||
formats, subtitles = [], {}
|
|
||||||
q = qualities(['Media', 'Alta', 'HQ', 'HD_READY', 'HD_FULL'])
|
|
||||||
for manager in ('rtveplayw', 'default'):
|
|
||||||
png = self._download_webpage(
|
|
||||||
f'http://www.rtve.es/ztnr/movil/thumbnail/{manager}/{media_type}/{video_id}.png',
|
|
||||||
video_id, 'Downloading url information', query={'q': 'v2'}, fatal=False)
|
|
||||||
if not png:
|
|
||||||
continue
|
|
||||||
|
|
||||||
for quality, video_url in self._decrypt_url(png):
|
|
||||||
ext = determine_ext(video_url)
|
|
||||||
if ext == 'm3u8':
|
|
||||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(
|
|
||||||
video_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
|
|
||||||
formats.extend(fmts)
|
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
|
||||||
elif ext == 'mpd':
|
|
||||||
fmts, subs = self._extract_mpd_formats_and_subtitles(
|
|
||||||
video_url, video_id, 'dash', fatal=False)
|
|
||||||
formats.extend(fmts)
|
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
|
||||||
else:
|
|
||||||
formats.append({
|
|
||||||
'format_id': quality,
|
|
||||||
'quality': q(quality),
|
|
||||||
'url': video_url,
|
|
||||||
})
|
|
||||||
return formats, subtitles
|
|
||||||
|
|
||||||
def _parse_metadata(self, metadata):
|
|
||||||
return traverse_obj(metadata, {
|
|
||||||
'title': ('title', {str.strip}),
|
|
||||||
'alt_title': ('alt', {str.strip}),
|
|
||||||
'description': ('description', {clean_html}),
|
|
||||||
'timestamp': ('dateOfEmission', {parse_iso8601(delimiter=' ')}),
|
|
||||||
'release_timestamp': ('publicationDate', {parse_iso8601(delimiter=' ')}),
|
|
||||||
'modified_timestamp': ('modificationDate', {parse_iso8601(delimiter=' ')}),
|
|
||||||
'thumbnail': (('thumbnail', 'image', 'imageSEO'), {url_or_none}, any),
|
|
||||||
'duration': ('duration', {float_or_none(scale=1000)}),
|
|
||||||
'is_live': ('live', {bool}),
|
|
||||||
'series': (('programTitle', ('programInfo', 'title')), {clean_html}, any),
|
|
||||||
})
|
|
||||||
|
|
||||||
|
|
||||||
class RTVEALaCartaIE(RTVEBaseIE):
|
|
||||||
IE_NAME = 'rtve.es:alacarta'
|
IE_NAME = 'rtve.es:alacarta'
|
||||||
IE_DESC = 'RTVE a la carta and Play'
|
IE_DESC = 'RTVE a la carta'
|
||||||
_VALID_URL = [
|
_VALID_URL = r'https?://(?:www\.)?rtve\.es/(m/)?(alacarta/videos|filmoteca)/[^/]+/[^/]+/(?P<id>\d+)'
|
||||||
r'https?://(?:www\.)?rtve\.es/(?:m/)?(?:(?:alacarta|play)/videos|filmoteca)/(?!directo)(?:[^/?#]+/){2}(?P<id>\d+)',
|
|
||||||
r'https?://(?:www\.)?rtve\.es/infantil/serie/[^/?#]+/video/[^/?#]+/(?P<id>\d+)',
|
|
||||||
]
|
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.rtve.es/alacarta/videos/la-aventura-del-saber/aventuraentornosilla/3088905/',
|
'url': 'http://www.rtve.es/alacarta/videos/balonmano/o-swiss-cup-masculina-final-espana-suecia/2491869/',
|
||||||
'md5': 'a964547824359a5753aef09d79fe984b',
|
'md5': '1d49b7e1ca7a7502c56a4bf1b60f1b43',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '3088905',
|
'id': '2491869',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'En torno a la silla',
|
'title': 'Balonmano - Swiss Cup masculina. Final: España-Suecia',
|
||||||
'duration': 1216.981,
|
'duration': 5024.566,
|
||||||
'series': 'La aventura del Saber',
|
'series': 'Balonmano',
|
||||||
'thumbnail': 'https://img2.rtve.es/v/aventuraentornosilla_3088905.png',
|
|
||||||
},
|
},
|
||||||
|
'expected_warnings': ['Failed to download MPD manifest', 'Failed to download m3u8 information'],
|
||||||
}, {
|
}, {
|
||||||
'note': 'Live stream',
|
'note': 'Live stream',
|
||||||
'url': 'http://www.rtve.es/alacarta/videos/television/24h-live/1694255/',
|
'url': 'http://www.rtve.es/alacarta/videos/television/24h-live/1694255/',
|
||||||
@ -145,88 +38,140 @@ class RTVEALaCartaIE(RTVEBaseIE):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 're:^24H LIVE [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
'title': 're:^24H LIVE [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||||
'is_live': True,
|
'is_live': True,
|
||||||
'live_status': 'is_live',
|
|
||||||
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
|
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': 'live stream',
|
'skip_download': 'live stream',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.rtve.es/alacarta/videos/servir-y-proteger/servir-proteger-capitulo-104/4236788/',
|
'url': 'http://www.rtve.es/alacarta/videos/servir-y-proteger/servir-proteger-capitulo-104/4236788/',
|
||||||
'md5': 'f3cf0d1902d008c48c793e736706c174',
|
'md5': 'd850f3c8731ea53952ebab489cf81cbf',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '4236788',
|
'id': '4236788',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Episodio 104',
|
'title': 'Servir y proteger - Capítulo 104',
|
||||||
'duration': 3222.8,
|
'duration': 3222.0,
|
||||||
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
|
|
||||||
'series': 'Servir y proteger',
|
|
||||||
},
|
},
|
||||||
|
'expected_warnings': ['Failed to download MPD manifest', 'Failed to download m3u8 information'],
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.rtve.es/m/alacarta/videos/cuentame-como-paso/cuentame-como-paso-t16-ultimo-minuto-nuestra-vida-capitulo-276/2969138/?media=tve',
|
'url': 'http://www.rtve.es/m/alacarta/videos/cuentame-como-paso/cuentame-como-paso-t16-ultimo-minuto-nuestra-vida-capitulo-276/2969138/?media=tve',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
'url': 'http://www.rtve.es/filmoteca/no-do/not-1-introduccion-primer-noticiario-espanol/1465256/',
|
'url': 'http://www.rtve.es/filmoteca/no-do/not-1-introduccion-primer-noticiario-espanol/1465256/',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
|
||||||
'url': 'https://www.rtve.es/play/videos/saber-vivir/07-07-24/16177116/',
|
|
||||||
'md5': 'a5b24fcdfa3ff5cb7908aba53d22d4b6',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '16177116',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Saber vivir - 07/07/24',
|
|
||||||
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
|
|
||||||
'duration': 2162.68,
|
|
||||||
'series': 'Saber vivir',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.rtve.es/infantil/serie/agus-lui-churros-crafts/video/gusano/7048976/',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '7048976',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'Gusano',
|
|
||||||
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
|
|
||||||
'duration': 292.86,
|
|
||||||
'series': 'Agus & Lui: Churros y Crafts',
|
|
||||||
'_old_archive_ids': ['rtveinfantil 7048976'],
|
|
||||||
},
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _get_subtitles(self, video_id):
|
def _real_initialize(self):
|
||||||
subtitle_data = self._download_json(
|
user_agent_b64 = base64.b64encode(self.get_param('http_headers')['User-Agent'].encode()).decode('utf-8')
|
||||||
f'https://api2.rtve.es/api/videos/{video_id}/subtitulos.json', video_id,
|
self._manager = self._download_json(
|
||||||
'Downloading subtitles info')
|
'http://www.rtve.es/odin/loki/' + user_agent_b64,
|
||||||
return traverse_obj(subtitle_data, ('page', 'items', ..., {
|
None, 'Fetching manager info')['manager']
|
||||||
'id': ('lang', {str}),
|
|
||||||
'url': ('src', {url_or_none}),
|
@staticmethod
|
||||||
}, all, {subs_list_to_dict(lang='es')}))
|
def _decrypt_url(png):
|
||||||
|
encrypted_data = io.BytesIO(base64.b64decode(png)[8:])
|
||||||
|
while True:
|
||||||
|
length = struct.unpack('!I', encrypted_data.read(4))[0]
|
||||||
|
chunk_type = encrypted_data.read(4)
|
||||||
|
if chunk_type == b'IEND':
|
||||||
|
break
|
||||||
|
data = encrypted_data.read(length)
|
||||||
|
if chunk_type == b'tEXt':
|
||||||
|
alphabet_data, text = data.split(b'\0')
|
||||||
|
quality, url_data = text.split(b'%%')
|
||||||
|
alphabet = []
|
||||||
|
e = 0
|
||||||
|
d = 0
|
||||||
|
for l in alphabet_data.decode('iso-8859-1'):
|
||||||
|
if d == 0:
|
||||||
|
alphabet.append(l)
|
||||||
|
d = e = (e + 1) % 4
|
||||||
|
else:
|
||||||
|
d -= 1
|
||||||
|
url = ''
|
||||||
|
f = 0
|
||||||
|
e = 3
|
||||||
|
b = 1
|
||||||
|
for letter in url_data.decode('iso-8859-1'):
|
||||||
|
if f == 0:
|
||||||
|
l = int(letter) * 10
|
||||||
|
f = 1
|
||||||
|
else:
|
||||||
|
if e == 0:
|
||||||
|
l += int(letter)
|
||||||
|
url += alphabet[l]
|
||||||
|
e = (b + 3) % 4
|
||||||
|
f = 0
|
||||||
|
b += 1
|
||||||
|
else:
|
||||||
|
e -= 1
|
||||||
|
|
||||||
|
yield quality.decode(), url
|
||||||
|
encrypted_data.read(4) # CRC
|
||||||
|
|
||||||
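The loop in `_decrypt_url` above walks the standard PNG container layout: an 8-byte signature, then chunks made of a 4-byte big-endian length, a 4-byte type, the payload, and a 4-byte CRC. A minimal standalone sketch of just that traversal is shown here, assuming well-formed input. `iter_png_chunks` and `build_chunk` are hypothetical helper names and not part of yt-dlp, and the sample payload is made up.

```python
import io
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def build_chunk(chunk_type, data):
    """Serialize one PNG chunk (demo helper only)."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack('!I', len(data)) + chunk_type + data + struct.pack('!I', crc)


def iter_png_chunks(raw):
    """Yield (chunk_type, data) pairs from raw PNG bytes, stopping at IEND."""
    stream = io.BytesIO(raw[len(PNG_SIGNATURE):])  # skip the 8-byte signature
    while True:
        header = stream.read(8)
        if len(header) < 8:  # truncated input: stop instead of raising
            return
        length, chunk_type = struct.unpack('!I4s', header)
        if chunk_type == b'IEND':
            return
        data = stream.read(length)
        stream.read(4)  # CRC, not verified in this sketch
        yield chunk_type, data


# A fake two-chunk "image": one tEXt chunk followed by IEND
sample = PNG_SIGNATURE + build_chunk(b'tEXt', b'alphabet\x00Alta%%0123') + build_chunk(b'IEND', b'')
chunks = list(iter_png_chunks(sample))
```

Checking the header length before unpacking is a choice of this sketch; the code in the diff unpacks directly and relies on the server returning a complete file.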
|
def _extract_png_formats(self, video_id):
|
||||||
|
png = self._download_webpage(
|
||||||
|
f'http://www.rtve.es/ztnr/movil/thumbnail/{self._manager}/videos/{video_id}.png',
|
||||||
|
video_id, 'Downloading url information', query={'q': 'v2'})
|
||||||
|
q = qualities(['Media', 'Alta', 'HQ', 'HD_READY', 'HD_FULL'])
|
||||||
|
formats = []
|
||||||
|
for quality, video_url in self._decrypt_url(png):
|
||||||
|
ext = determine_ext(video_url)
|
||||||
|
if ext == 'm3u8':
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
video_url, video_id, 'mp4', 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False))
|
||||||
|
elif ext == 'mpd':
|
||||||
|
formats.extend(self._extract_mpd_formats(
|
||||||
|
video_url, video_id, 'dash', fatal=False))
|
||||||
|
else:
|
||||||
|
formats.append({
|
||||||
|
'format_id': quality,
|
||||||
|
'quality': q(quality),
|
||||||
|
'url': video_url,
|
||||||
|
})
|
||||||
|
return formats
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
metadata = self._download_json(
|
info = self._download_json(
|
||||||
f'http://www.rtve.es/api/videos/{video_id}/config/alacarta_videos.json',
|
f'http://www.rtve.es/api/videos/{video_id}/config/alacarta_videos.json',
|
||||||
video_id)['page']['items'][0]
|
video_id)['page']['items'][0]
|
||||||
if metadata['state'] == 'DESPU':
|
if info['state'] == 'DESPU':
|
||||||
raise ExtractorError('The video is no longer available', expected=True)
|
raise ExtractorError('The video is no longer available', expected=True)
|
||||||
formats, subtitles = self._extract_png_formats_and_subtitles(video_id)
|
title = info['title'].strip()
|
||||||
|
formats = self._extract_png_formats(video_id)
|
||||||
|
|
||||||
self._merge_subtitles(self.extract_subtitles(video_id), target=subtitles)
|
subtitles = None
|
||||||
|
sbt_file = info.get('sbtFile')
|
||||||
|
if sbt_file:
|
||||||
|
subtitles = self.extract_subtitles(video_id, sbt_file)
|
||||||
|
|
||||||
is_infantil = urllib.parse.urlparse(url).path.startswith('/infantil/')
|
is_live = info.get('live') is True
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
|
'thumbnail': info.get('image'),
|
||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
**self._parse_metadata(metadata),
|
'duration': float_or_none(info.get('duration'), 1000),
|
||||||
'_old_archive_ids': [make_archive_id('rtveinfantil', video_id)] if is_infantil else None,
|
'is_live': is_live,
|
||||||
|
'series': info.get('programTitle'),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
def _get_subtitles(self, video_id, sub_file):
|
||||||
|
subs = self._download_json(
|
||||||
|
sub_file + '.json', video_id,
|
||||||
|
'Downloading subtitles info')['page']['items']
|
||||||
|
return dict(
|
||||||
|
(s['lang'], [{'ext': 'vtt', 'url': s['src']}])
|
||||||
|
for s in subs)
|
||||||
|
|
||||||
class RTVEAudioIE(RTVEBaseIE):
|
|
||||||
|
class RTVEAudioIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
|
||||||
IE_NAME = 'rtve.es:audio'
|
IE_NAME = 'rtve.es:audio'
|
||||||
IE_DESC = 'RTVE audio'
|
IE_DESC = 'RTVE audio'
|
||||||
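The On24 hunk above shows two ways of matching several URL shapes: a list of patterns that all share the group names `id` and `key`, and a single alternation that needs numbered names (`id_1`, `id_2`) because Python's `re` rejects duplicate group names within one pattern. A minimal sketch of the single-pattern variant follows. The regex is a simplified version of the one in the diff, `extract_event` is a hypothetical helper, and the short-form URL is a made-up example that reuses the ID and key from the test case.

```python
import re

# Simplified from the single-regex side of the diff; (?x) lets the pattern span lines
PATTERN = re.compile(r'''(?x)
    https?://event\.on24\.com/(?:
        wcc/r/(?P<id_1>\d{7})/(?P<key_1>[0-9A-F]{32})|
        eventRegistration/console/EventConsoleApollo\.jsp\?
            (?:[^/#?]*&)?eventid=(?P<id_2>\d{7})[^/#?]*&key=(?P<key_2>[0-9A-F]{32})
    )''')


def extract_event(url):
    """Return (event_id, event_key), or None if the URL does not match."""
    mobj = PATTERN.match(url)
    if mobj is None:
        return None
    # Only one alternative matches, so the other pair of groups is None
    return (mobj.group('id_1') or mobj.group('id_2'),
            mobj.group('key_1') or mobj.group('key_2'))


short_form = extract_event(
    'https://event.on24.com/wcc/r/2197467/5DF57BE53237F36A43B478DD36277A84')
query_form = extract_event(
    'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp'
    '?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84')
```

With a list of patterns, each entry can reuse `id` and `key`, which is what allows the single `group('id', 'key')` call on the other side of the diff.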
_VALID_URL = r'https?://(?:www\.)?rtve\.es/(alacarta|play)/audios/(?:[^/?#]+/){2}(?P<id>\d+)'
|
_VALID_URL = r'https?://(?:www\.)?rtve\.es/(alacarta|play)/audios/[^/]+/[^/]+/(?P<id>[0-9]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://www.rtve.es/alacarta/audios/a-hombros-de-gigantes/palabra-ingeniero-codigos-informaticos-27-04-21/5889192/',
|
'url': 'https://www.rtve.es/alacarta/audios/a-hombros-de-gigantes/palabra-ingeniero-codigos-informaticos-27-04-21/5889192/',
|
||||||
@ -235,11 +180,9 @@ class RTVEAudioIE(RTVEBaseIE):
|
|||||||
'id': '5889192',
|
'id': '5889192',
|
||||||
'ext': 'mp3',
|
'ext': 'mp3',
|
||||||
'title': 'Códigos informáticos',
|
'title': 'Códigos informáticos',
|
||||||
'alt_title': 'Códigos informáticos - Escuchar ahora',
|
'thumbnail': r're:https?://.+/1598856591583.jpg',
|
||||||
'duration': 349.440,
|
'duration': 349.440,
|
||||||
'series': 'A hombros de gigantes',
|
'series': 'A hombros de gigantes',
|
||||||
'description': 'md5:72b0d7c1ca20fd327bdfff7ac0171afb',
|
|
||||||
'thumbnail': 'https://img2.rtve.es/a/palabra-ingeniero-codigos-informaticos-270421_5889192.png',
|
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.rtve.es/play/audios/en-radio-3/ignatius-farray/5791165/',
|
'url': 'https://www.rtve.es/play/audios/en-radio-3/ignatius-farray/5791165/',
|
||||||
@ -248,11 +191,9 @@ class RTVEAudioIE(RTVEBaseIE):
|
|||||||
'id': '5791165',
|
'id': '5791165',
|
||||||
'ext': 'mp3',
|
'ext': 'mp3',
|
||||||
'title': 'Ignatius Farray',
|
'title': 'Ignatius Farray',
|
||||||
'alt_title': 'En Radio 3 - Ignatius Farray - 13/02/21 - escuchar ahora',
|
|
||||||
'thumbnail': r're:https?://.+/1613243011863.jpg',
|
'thumbnail': r're:https?://.+/1613243011863.jpg',
|
||||||
'duration': 3559.559,
|
'duration': 3559.559,
|
||||||
'series': 'En Radio 3',
|
'series': 'En Radio 3',
|
||||||
'description': 'md5:124aa60b461e0b1724a380bad3bc4040',
|
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.rtve.es/play/audios/frankenstein-o-el-moderno-prometeo/capitulo-26-ultimo-muerte-victor-juan-jose-plans-mary-shelley/6082623/',
|
'url': 'https://www.rtve.es/play/audios/frankenstein-o-el-moderno-prometeo/capitulo-26-ultimo-muerte-victor-juan-jose-plans-mary-shelley/6082623/',
|
||||||
@ -261,101 +202,126 @@ class RTVEAudioIE(RTVEBaseIE):
|
|||||||
'id': '6082623',
|
'id': '6082623',
|
||||||
'ext': 'mp3',
|
'ext': 'mp3',
|
||||||
'title': 'Capítulo 26 y último: La muerte de Victor',
|
'title': 'Capítulo 26 y último: La muerte de Victor',
|
||||||
'alt_title': 'Frankenstein o el moderno Prometeo - Capítulo 26 y último: La muerte de Victor',
|
|
||||||
'thumbnail': r're:https?://.+/1632147445707.jpg',
|
'thumbnail': r're:https?://.+/1632147445707.jpg',
|
||||||
'duration': 3174.086,
|
'duration': 3174.086,
|
||||||
'series': 'Frankenstein o el moderno Prometeo',
|
'series': 'Frankenstein o el moderno Prometeo',
|
||||||
'description': 'md5:4ee6fcb82ebe2e46d267e1d1c1a8f7b5',
|
|
||||||
},
|
},
|
||||||
}]
|
}]
|
||||||
|
|
||||||
|
def _extract_png_formats(self, audio_id):
|
||||||
|
"""
|
||||||
|
Retrieve the PNG thumbnail associated with the media, which obfuscates
|
||||||
|
valuable information about it. This information is decrypted by the
|
||||||
|
base class's _decrypt_url function, which yields each media quality
|
||||||
|
and its media URL
|
||||||
|
"""
|
||||||
|
png = self._download_webpage(
|
||||||
|
f'http://www.rtve.es/ztnr/movil/thumbnail/{self._manager}/audios/{audio_id}.png',
|
||||||
|
audio_id, 'Downloading url information', query={'q': 'v2'})
|
||||||
|
q = qualities(['Media', 'Alta', 'HQ', 'HD_READY', 'HD_FULL'])
|
||||||
|
formats = []
|
||||||
|
for quality, audio_url in self._decrypt_url(png):
|
||||||
|
ext = determine_ext(audio_url)
|
||||||
|
if ext == 'm3u8':
|
||||||
|
formats.extend(self._extract_m3u8_formats(
|
||||||
|
audio_url, audio_id, 'mp4', 'm3u8_native',
|
||||||
|
m3u8_id='hls', fatal=False))
|
||||||
|
elif ext == 'mpd':
|
||||||
|
formats.extend(self._extract_mpd_formats(
|
||||||
|
audio_url, audio_id, 'dash', fatal=False))
|
||||||
|
else:
|
||||||
|
formats.append({
|
||||||
|
'format_id': quality,
|
||||||
|
'quality': q(quality),
|
||||||
|
'url': audio_url,
|
||||||
|
})
|
||||||
|
return formats
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
audio_id = self._match_id(url)
|
audio_id = self._match_id(url)
|
||||||
metadata = self._download_json(
|
info = self._download_json(
|
||||||
f'https://www.rtve.es/api/audios/{audio_id}.json', audio_id)['page']['items'][0]
|
f'https://www.rtve.es/api/audios/{audio_id}.json',
|
||||||
|
audio_id)['page']['items'][0]
|
||||||
formats, subtitles = self._extract_png_formats_and_subtitles(audio_id, media_type='audios')
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': audio_id,
|
'id': audio_id,
|
||||||
'formats': formats,
|
'title': info['title'].strip(),
|
||||||
'subtitles': subtitles,
|
'thumbnail': info.get('thumbnail'),
|
||||||
**self._parse_metadata(metadata),
|
'duration': float_or_none(info.get('duration'), 1000),
|
||||||
|
'series': try_get(info, lambda x: x['programInfo']['title']),
|
||||||
|
'formats': self._extract_png_formats(audio_id),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class RTVELiveIE(RTVEBaseIE):
|
class RTVEInfantilIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
|
||||||
|
IE_NAME = 'rtve.es:infantil'
|
||||||
|
IE_DESC = 'RTVE infantil'
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?rtve\.es/infantil/serie/[^/]+/video/[^/]+/(?P<id>[0-9]+)/'
|
||||||
|
|
||||||
|
_TESTS = [{
|
||||||
|
'url': 'http://www.rtve.es/infantil/serie/cleo/video/maneras-vivir/3040283/',
|
||||||
|
'md5': '5747454717aedf9f9fdf212d1bcfc48d',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '3040283',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Maneras de vivir',
|
||||||
|
'thumbnail': r're:https?://.+/1426182947956\.JPG',
|
||||||
|
'duration': 357.958,
|
||||||
|
},
|
||||||
|
'expected_warnings': ['Failed to download MPD manifest', 'Failed to download m3u8 information'],
|
||||||
|
}]
|
||||||
|
|
||||||
|
|
||||||
|
class RTVELiveIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
|
||||||
IE_NAME = 'rtve.es:live'
|
IE_NAME = 'rtve.es:live'
|
||||||
IE_DESC = 'RTVE.es live streams'
|
IE_DESC = 'RTVE.es live streams'
|
||||||
_VALID_URL = [
|
_VALID_URL = r'https?://(?:www\.)?rtve\.es/directo/(?P<id>[a-zA-Z0-9-]+)'
|
||||||
r'https?://(?:www\.)?rtve\.es/directo/(?P<id>[a-zA-Z0-9-]+)',
|
|
||||||
r'https?://(?:www\.)?rtve\.es/play/videos/directo/[^/?#]+/(?P<id>[a-zA-Z0-9-]+)',
|
|
||||||
]
|
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'http://www.rtve.es/directo/la-1/',
|
'url': 'http://www.rtve.es/directo/la-1/',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'la-1',
|
'id': 'la-1',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'live_status': 'is_live',
|
'title': 're:^La 1 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||||
'title': str,
|
|
||||||
'description': str,
|
|
||||||
'thumbnail': r're:https://img\d\.rtve\.es/resources/thumbslive/\d+\.jpg',
|
|
||||||
'timestamp': int,
|
|
||||||
'upload_date': str,
|
|
||||||
},
|
},
|
||||||
'params': {'skip_download': 'live stream'},
|
'params': {
|
||||||
}, {
|
'skip_download': 'live stream',
|
||||||
'url': 'https://www.rtve.es/play/videos/directo/deportes/tdp/',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'tdp',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'live_status': 'is_live',
|
|
||||||
'title': str,
|
|
||||||
'description': str,
|
|
||||||
'thumbnail': r're:https://img2\d\.rtve\.es/resources/thumbslive/\d+\.jpg',
|
|
||||||
'timestamp': int,
|
|
||||||
'upload_date': str,
|
|
||||||
},
|
},
|
||||||
'params': {'skip_download': 'live stream'},
|
|
||||||
}, {
|
|
||||||
'url': 'http://www.rtve.es/play/videos/directo/canales-lineales/la-1/',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
mobj = self._match_valid_url(url)
|
||||||
|
video_id = mobj.group('id')
|
||||||
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
webpage = self._download_webpage(url, video_id)
|
||||||
|
title = remove_end(self._og_search_title(webpage), ' en directo en RTVE.es')
|
||||||
|
title = remove_start(title, 'Estoy viendo ')
|
||||||
|
|
||||||
data_setup = self._search_json(
|
vidplayer_id = self._search_regex(
|
||||||
r'<div[^>]+class="[^"]*videoPlayer[^"]*"[^>]*data-setup=\'',
|
(r'playerId=player([0-9]+)',
|
||||||
webpage, 'data_setup', video_id)
|
r'class=["\'].*?\blive_mod\b.*?["\'][^>]+data-assetid=["\'](\d+)',
|
||||||
|
r'data-id=["\'](\d+)'),
|
||||||
formats, subtitles = self._extract_png_formats_and_subtitles(data_setup['idAsset'])
|
webpage, 'internal video ID')
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
**self._search_json_ld(webpage, video_id, fatal=False),
|
'title': title,
|
||||||
'title': self._html_extract_title(webpage),
|
'formats': self._extract_png_formats(vidplayer_id),
|
||||||
'formats': formats,
|
|
||||||
'subtitles': subtitles,
|
|
||||||
'is_live': True,
|
'is_live': True,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class RTVETelevisionIE(InfoExtractor):
|
class RTVETelevisionIE(InfoExtractor):
|
||||||
IE_NAME = 'rtve.es:television'
|
IE_NAME = 'rtve.es:television'
|
||||||
_VALID_URL = r'https?://(?:www\.)?rtve\.es/television/[^/?#]+/[^/?#]+/(?P<id>\d+).shtml'
|
_VALID_URL = r'https?://(?:www\.)?rtve\.es/television/[^/]+/[^/]+/(?P<id>\d+).shtml'
|
||||||
|
|
||||||
_TEST = {
|
_TEST = {
|
||||||
'url': 'https://www.rtve.es/television/20091103/video-inedito-del-8o-programa/299020.shtml',
|
'url': 'http://www.rtve.es/television/20160628/revolucion-del-movil/1364141.shtml',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '572515',
|
'id': '3069778',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Clase inédita',
|
'title': 'Documentos TV - La revolución del móvil',
|
||||||
'duration': 335.817,
|
'duration': 3496.948,
|
||||||
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
|
|
||||||
'series': 'El coro de la cárcel',
|
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -366,8 +332,11 @@ class RTVETelevisionIE(InfoExtractor):
|
|||||||
page_id = self._match_id(url)
|
page_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, page_id)
|
webpage = self._download_webpage(url, page_id)
|
||||||
|
|
||||||
play_url = self._html_search_meta('contentUrl', webpage)
|
alacarta_url = self._search_regex(
|
||||||
if play_url is None:
|
r'data-location="alacarta_videos"[^<]+url":"(http://www\.rtve\.es/alacarta.+?)&',
|
||||||
raise ExtractorError('The webpage doesn\'t contain any video', expected=True)
|
webpage, 'alacarta url', default=None)
|
||||||
|
if alacarta_url is None:
|
||||||
|
raise ExtractorError(
|
||||||
|
'The webpage doesn\'t contain any video', expected=True)
|
||||||
|
|
||||||
return self.url_result(play_url, ie=RTVEALaCartaIE.ie_key())
|
return self.url_result(alacarta_url, ie=RTVEALaCartaIE.ie_key())
|
||||||
|
@ -9,9 +9,7 @@ from ..utils import (
|
|||||||
|
|
||||||
|
|
||||||
class RTVSIE(InfoExtractor):
|
class RTVSIE(InfoExtractor):
|
||||||
IE_NAME = 'stvr'
|
_VALID_URL = r'https?://(?:www\.)?rtvs\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
|
||||||
IE_DESC = 'Slovak Television and Radio (formerly RTVS)'
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?(?:rtvs|stvr)\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
# radio archive
|
# radio archive
|
||||||
'url': 'http://www.rtvs.sk/radio/archiv/11224/414872',
|
'url': 'http://www.rtvs.sk/radio/archiv/11224/414872',
|
||||||
@ -21,7 +19,7 @@ class RTVSIE(InfoExtractor):
|
|||||||
'ext': 'mp3',
|
'ext': 'mp3',
|
||||||
'title': 'Ostrov pokladov 1 časť.mp3',
|
'title': 'Ostrov pokladov 1 časť.mp3',
|
||||||
'duration': 2854,
|
'duration': 2854,
|
||||||
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0000/rtvs-00009383.png',
|
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0000/b1R8.rtvs.jpg',
|
||||||
'display_id': '135331',
|
'display_id': '135331',
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
@ -32,7 +30,7 @@ class RTVSIE(InfoExtractor):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'Amaro Džives - Náš deň',
|
'title': 'Amaro Džives - Náš deň',
|
||||||
'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.',
|
'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.',
|
||||||
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
|
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
|
||||||
'timestamp': 1428555900,
|
'timestamp': 1428555900,
|
||||||
'upload_date': '20150409',
|
'upload_date': '20150409',
|
||||||
'duration': 4986,
|
'duration': 4986,
|
||||||
@ -49,11 +47,8 @@ class RTVSIE(InfoExtractor):
|
|||||||
'display_id': '307655',
|
'display_id': '307655',
|
||||||
'duration': 831,
|
'duration': 831,
|
||||||
'upload_date': '20211111',
|
'upload_date': '20211111',
|
||||||
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0916/robin.jpg',
|
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0916/robin.jpg',
|
||||||
},
|
},
|
||||||
}, {
|
|
||||||
'url': 'https://www.stvr.sk/radio/archiv/11224/414872',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
|
@ -7,6 +7,7 @@ from ..utils import (
|
|||||||
ExtractorError,
|
ExtractorError,
|
||||||
UnsupportedError,
|
UnsupportedError,
|
||||||
clean_html,
|
clean_html,
|
||||||
|
determine_ext,
|
||||||
extract_attributes,
|
extract_attributes,
|
||||||
format_field,
|
format_field,
|
||||||
get_element_by_class,
|
get_element_by_class,
|
||||||
@ -35,7 +36,7 @@ class RumbleEmbedIE(InfoExtractor):
|
|||||||
'upload_date': '20191020',
|
'upload_date': '20191020',
|
||||||
'channel_url': 'https://rumble.com/c/WMAR',
|
'channel_url': 'https://rumble.com/c/WMAR',
|
||||||
'channel': 'WMAR',
|
'channel': 'WMAR',
|
||||||
'thumbnail': r're:https://.+\.jpg',
|
'thumbnail': 'https://sp.rmbl.ws/s8/1/5/M/z/1/5Mz1a.qR4e-small-WMAR-2-News-Latest-Headline.jpg',
|
||||||
'duration': 234,
|
'duration': 234,
|
||||||
'uploader': 'WMAR',
|
'uploader': 'WMAR',
|
||||||
'live_status': 'not_live',
|
'live_status': 'not_live',
|
||||||
@ -51,7 +52,7 @@ class RumbleEmbedIE(InfoExtractor):
|
|||||||
'upload_date': '20220217',
|
'upload_date': '20220217',
|
||||||
'channel_url': 'https://rumble.com/c/CyberTechNews',
|
'channel_url': 'https://rumble.com/c/CyberTechNews',
|
||||||
'channel': 'CTNews',
|
'channel': 'CTNews',
|
||||||
'thumbnail': r're:https://.+\.jpg',
|
'thumbnail': 'https://sp.rmbl.ws/s8/6/7/i/9/h/7i9hd.OvCc.jpg',
|
||||||
'duration': 901,
|
'duration': 901,
|
||||||
'uploader': 'CTNews',
|
'uploader': 'CTNews',
|
||||||
'live_status': 'not_live',
|
'live_status': 'not_live',
|
||||||
@ -113,22 +114,6 @@ class RumbleEmbedIE(InfoExtractor):
|
|||||||
'live_status': 'was_live',
|
'live_status': 'was_live',
|
||||||
},
|
},
|
||||||
'params': {'skip_download': True},
|
'params': {'skip_download': True},
|
||||||
}, {
|
|
||||||
'url': 'https://rumble.com/embed/v6pezdb',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'v6pezdb',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': '"Es war einmal ein Mädchen" – Ein filmisches Zeitzeugnis aus Leningrad 1944',
|
|
||||||
'uploader': 'RT DE',
|
|
||||||
'channel': 'RT DE',
|
|
||||||
'channel_url': 'https://rumble.com/c/RTDE',
|
|
||||||
'duration': 309,
|
|
||||||
'thumbnail': 'https://1a-1791.com/video/fww1/dc/s8/1/n/z/2/y/nz2yy.qR4e-small-Es-war-einmal-ein-Mdchen-Ei.jpg',
|
|
||||||
'timestamp': 1743703500,
|
|
||||||
'upload_date': '20250403',
|
|
||||||
'live_status': 'not_live',
|
|
||||||
},
|
|
||||||
'params': {'skip_download': True},
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://rumble.com/embed/ufe9n.v5pv5f',
|
'url': 'https://rumble.com/embed/ufe9n.v5pv5f',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -183,42 +168,40 @@ class RumbleEmbedIE(InfoExtractor):
|
|||||||
live_status = None
|
live_status = None
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
for format_type, format_info in (video.get('ua') or {}).items():
|
for ext, ext_info in (video.get('ua') or {}).items():
|
||||||
if isinstance(format_info, dict):
|
if isinstance(ext_info, dict):
|
||||||
for height, video_info in format_info.items():
|
for height, video_info in ext_info.items():
|
||||||
if not traverse_obj(video_info, ('meta', 'h', {int_or_none})):
|
if not traverse_obj(video_info, ('meta', 'h', {int_or_none})):
|
||||||
video_info.setdefault('meta', {})['h'] = height
|
video_info.setdefault('meta', {})['h'] = height
|
||||||
format_info = format_info.values()
|
ext_info = ext_info.values()
|
||||||
|
|
||||||
for video_info in format_info:
|
for video_info in ext_info:
|
||||||
meta = video_info.get('meta') or {}
|
meta = video_info.get('meta') or {}
|
||||||
if not video_info.get('url'):
|
if not video_info.get('url'):
|
||||||
continue
|
continue
|
||||||
# With default query params returns m3u8 variants which are duplicates, without returns tar files
|
if ext == 'hls':
|
||||||
if format_type == 'tar':
|
|
||||||
continue
|
|
||||||
if format_type == 'hls':
|
|
||||||
if meta.get('live') is True and video.get('live') == 1:
|
if meta.get('live') is True and video.get('live') == 1:
|
||||||
live_status = 'post_live'
|
live_status = 'post_live'
|
||||||
formats.extend(self._extract_m3u8_formats(
|
formats.extend(self._extract_m3u8_formats(
|
||||||
video_info['url'], video_id,
|
video_info['url'], video_id,
|
||||||
ext='mp4', m3u8_id='hls', fatal=False, live=live_status == 'is_live'))
|
ext='mp4', m3u8_id='hls', fatal=False, live=live_status == 'is_live'))
|
||||||
continue
|
continue
|
||||||
is_timeline = format_type == 'timeline'
|
timeline = ext == 'timeline'
|
||||||
is_audio = format_type == 'audio'
|
if timeline:
|
||||||
|
ext = determine_ext(video_info['url'])
|
||||||
formats.append({
|
formats.append({
|
||||||
'acodec': 'none' if is_timeline else None,
|
'ext': ext,
|
||||||
'vcodec': 'none' if is_audio else None,
|
'acodec': 'none' if timeline else None,
|
||||||
'url': video_info['url'],
|
'url': video_info['url'],
|
||||||
'format_id': join_nonempty(format_type, format_field(meta, 'h', '%sp')),
|
'format_id': join_nonempty(ext, format_field(meta, 'h', '%sp')),
|
||||||
'format_note': 'Timeline' if is_timeline else None,
|
'format_note': 'Timeline' if timeline else None,
|
||||||
'fps': None if is_timeline or is_audio else video.get('fps'),
|
'fps': None if timeline else video.get('fps'),
|
||||||
**traverse_obj(meta, {
|
**traverse_obj(meta, {
|
||||||
'tbr': ('bitrate', {int_or_none}),
|
'tbr': 'bitrate',
|
||||||
'filesize': ('size', {int_or_none}),
|
'filesize': 'size',
|
||||||
'width': ('w', {int_or_none}),
|
'width': 'w',
|
||||||
'height': ('h', {int_or_none}),
|
'height': 'h',
|
||||||
}),
|
}, expected_type=lambda x: int(x) or None),
|
||||||
})
|
})
|
||||||
|
|
||||||
subtitles = {
|
subtitles = {
|
||||||
|
@ -122,15 +122,6 @@ class SBSIE(InfoExtractor):
|
|||||||
if traverse_obj(media, ('partOfSeries', {dict})):
|
if traverse_obj(media, ('partOfSeries', {dict})):
|
||||||
media['epName'] = traverse_obj(media, ('title', {str}))
|
media['epName'] = traverse_obj(media, ('title', {str}))
|
||||||
|
|
||||||
# Need to set different language for forced subs or else they have priority over full subs
|
|
||||||
fixed_subtitles = {}
|
|
||||||
for lang, subs in subtitles.items():
|
|
||||||
for sub in subs:
|
|
||||||
fixed_lang = lang
|
|
||||||
if sub['url'].lower().endswith('_fe.vtt'):
|
|
||||||
fixed_lang += '-forced'
|
|
||||||
fixed_subtitles.setdefault(fixed_lang, []).append(sub)
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
**traverse_obj(media, {
|
**traverse_obj(media, {
|
||||||
@ -160,6 +151,6 @@ class SBSIE(InfoExtractor):
|
|||||||
}),
|
}),
|
||||||
}),
|
}),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'subtitles': fixed_subtitles,
|
'subtitles': subtitles,
|
||||||
'uploader': 'SBSC',
|
'uploader': 'SBSC',
|
||||||
}
|
}
|
||||||
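The forced-subtitle regrouping in the SBS hunk above can be tried in isolation. The following is a minimal sketch, assuming the same `{lang: [track, ...]}` subtitle shape the diff uses; the function name `split_forced_subtitles` and the example URLs are hypothetical and do not appear in the extractor.

```python
def split_forced_subtitles(subtitles):
    """Move tracks whose URL ends in '_fe.vtt' under a '<lang>-forced' key."""
    fixed = {}
    for lang, subs in subtitles.items():
        for sub in subs:
            fixed_lang = lang
            # Same test as in the diff: case-insensitive '_fe.vtt' suffix
            if sub['url'].lower().endswith('_fe.vtt'):
                fixed_lang += '-forced'
            fixed.setdefault(fixed_lang, []).append(sub)
    return fixed


# Hypothetical input: one forced track and one full track for the same language
example = {'en': [
    {'url': 'https://example.invalid/subs/EN_FE.vtt'},
    {'url': 'https://example.invalid/subs/en.vtt'},
]}
print(split_forced_subtitles(example))
```

Keeping the forced track under its own key means a plain `en` lookup only sees the full subtitles, which is the concern the code comment in the hunk describes.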
|
@ -513,7 +513,7 @@ class TVPVODBaseIE(InfoExtractor):
|
|||||||
|
|
||||||
class TVPVODVideoIE(TVPVODBaseIE):
|
class TVPVODVideoIE(TVPVODBaseIE):
|
||||||
IE_NAME = 'tvp:vod'
|
IE_NAME = 'tvp:vod'
|
||||||
_VALID_URL = r'https?://vod\.tvp\.pl/(?P<category>[a-z\d-]+,\d+)/[a-z\d-]+(?<!-odcinki)(?:-odcinki,\d+/odcinek--?\d+,S-?\d+E-?\d+)?,(?P<id>\d+)/?(?:[?#]|$)'
|
_VALID_URL = r'https?://vod\.tvp\.pl/(?P<category>[a-z\d-]+,\d+)/[a-z\d-]+(?<!-odcinki)(?:-odcinki,\d+/odcinek-\d+,S\d+E\d+)?,(?P<id>\d+)/?(?:[?#]|$)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://vod.tvp.pl/dla-dzieci,24/laboratorium-alchemika-odcinki,309338/odcinek-24,S01E24,311357',
|
'url': 'https://vod.tvp.pl/dla-dzieci,24/laboratorium-alchemika-odcinki,309338/odcinek-24,S01E24,311357',
|
||||||
@ -568,9 +568,6 @@ class TVPVODVideoIE(TVPVODBaseIE):
|
|||||||
'live_status': 'is_live',
|
'live_status': 'is_live',
|
||||||
'thumbnail': 're:https?://.+',
|
'thumbnail': 're:https?://.+',
|
||||||
},
|
},
|
||||||
}, {
|
|
||||||
'url': 'https://vod.tvp.pl/informacje-i-publicystyka,205/konskie-2025-debata-przedwyborcza-odcinki,2028435/odcinek--1,S01E-1,2028419',
|
|
||||||
'only_matching': True,
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
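The two `_VALID_URL` patterns in the TVP hunk above differ only in whether a minus sign is allowed in front of the episode and season numbers. The sketch below compares them against two URLs taken from the `_TESTS` entries in the hunk, using only the standard `re` module; the constant names and the `match_id` helper are hypothetical and are not part of the extractor.

```python
import re

# Pattern copied from the hunk: allows an optional '-' before each number
WITH_NEGATIVES = (
    r'https?://vod\.tvp\.pl/(?P<category>[a-z\d-]+,\d+)/[a-z\d-]+(?<!-odcinki)'
    r'(?:-odcinki,\d+/odcinek--?\d+,S-?\d+E-?\d+)?,(?P<id>\d+)/?(?:[?#]|$)')
# Pattern copied from the hunk: digits only
WITHOUT_NEGATIVES = (
    r'https?://vod\.tvp\.pl/(?P<category>[a-z\d-]+,\d+)/[a-z\d-]+(?<!-odcinki)'
    r'(?:-odcinki,\d+/odcinek-\d+,S\d+E\d+)?,(?P<id>\d+)/?(?:[?#]|$)')

REGULAR_URL = ('https://vod.tvp.pl/dla-dzieci,24/'
               'laboratorium-alchemika-odcinki,309338/odcinek-24,S01E24,311357')
NEGATIVE_URL = ('https://vod.tvp.pl/informacje-i-publicystyka,205/'
                'konskie-2025-debata-przedwyborcza-odcinki,2028435/'
                'odcinek--1,S01E-1,2028419')


def match_id(pattern, url):
    """Return the captured 'id' group, or None when the URL does not match."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


for name, pattern in (('with negatives', WITH_NEGATIVES),
                      ('without negatives', WITHOUT_NEGATIVES)):
    print(name, match_id(pattern, REGULAR_URL), match_id(pattern, NEGATIVE_URL))
```

Only the pattern with the optional minus signs accepts the `odcinek--1,S01E-1` URL, which is consistent with that `only_matching` test appearing on one side of the diff only.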
|
@ -1,21 +1,13 @@
|
|||||||
import json
|
import json
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import clean_html, remove_end, unified_timestamp, url_or_none
|
||||||
clean_html,
|
from ..utils.traversal import traverse_obj
|
||||||
extract_attributes,
|
|
||||||
parse_qs,
|
|
||||||
remove_end,
|
|
||||||
require,
|
|
||||||
unified_timestamp,
|
|
||||||
url_or_none,
|
|
||||||
)
|
|
||||||
from ..utils.traversal import find_element, traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class TvwIE(InfoExtractor):
|
class TvwIE(InfoExtractor):
|
||||||
IE_NAME = 'tvw'
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?tvw\.org/video/(?P<id>[^/?#]+)'
|
_VALID_URL = r'https?://(?:www\.)?tvw\.org/video/(?P<id>[^/?#]+)'
|
||||||
|
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://tvw.org/video/billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211/',
|
'url': 'https://tvw.org/video/billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211/',
|
||||||
'md5': '9ceb94fe2bb7fd726f74f16356825703',
|
'md5': '9ceb94fe2bb7fd726f74f16356825703',
|
||||||
@ -123,43 +115,3 @@ class TvwIE(InfoExtractor):
|
|||||||
'is_live': ('eventStatus', {lambda x: x == 'live'}),
|
'is_live': ('eventStatus', {lambda x: x == 'live'}),
|
||||||
}),
|
}),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class TvwTvChannelsIE(InfoExtractor):
|
|
||||||
IE_NAME = 'tvw:tvchannels'
|
|
||||||
_VALID_URL = r'https?://(?:www\.)?tvw\.org/tvchannels/(?P<id>[^/?#]+)'
|
|
||||||
_TESTS = [{
|
|
||||||
'url': 'https://tvw.org/tvchannels/air/',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'air',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': r're:TVW Cable Channel Live Stream',
|
|
||||||
'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)$',
|
|
||||||
'live_status': 'is_live',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://tvw.org/tvchannels/tvw2/',
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'tvw2',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': r're:TVW-2 Broadcast Channel',
|
|
||||||
'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)$',
|
|
||||||
'live_status': 'is_live',
|
|
||||||
},
|
|
||||||
}]
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
|
||||||
video_id = self._match_id(url)
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
|
||||||
|
|
||||||
m3u8_url = traverse_obj(webpage, (
|
|
||||||
{find_element(id='invintus-persistent-stream-frame', html=True)}, {extract_attributes},
|
|
||||||
'src', {parse_qs}, 'encoder', 0, {json.loads}, 'live247URI', {url_or_none}, {require('stream url')}))
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': video_id,
|
|
||||||
'formats': self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', m3u8_id='hls', live=True),
|
|
||||||
'title': remove_end(self._og_search_title(webpage, default=None), ' - TVW'),
|
|
||||||
'thumbnail': self._og_search_thumbnail(webpage, default=None),
|
|
||||||
'is_live': True,
|
|
||||||
}
|
|
||||||
|
@ -14,20 +14,19 @@ from ..utils import (
|
|||||||
dict_get,
|
dict_get,
|
||||||
float_or_none,
|
float_or_none,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
join_nonempty,
|
|
||||||
make_archive_id,
|
make_archive_id,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
parse_iso8601,
|
parse_iso8601,
|
||||||
parse_qs,
|
parse_qs,
|
||||||
qualities,
|
qualities,
|
||||||
str_or_none,
|
str_or_none,
|
||||||
|
traverse_obj,
|
||||||
try_get,
|
try_get,
|
||||||
unified_timestamp,
|
unified_timestamp,
|
||||||
update_url_query,
|
update_url_query,
|
||||||
url_or_none,
|
url_or_none,
|
||||||
urljoin,
|
urljoin,
|
||||||
)
|
)
|
||||||
from ..utils.traversal import traverse_obj, value
|
|
||||||
|
|
||||||
|
|
||||||
class TwitchBaseIE(InfoExtractor):
|
class TwitchBaseIE(InfoExtractor):
|
||||||
@ -43,10 +42,10 @@ class TwitchBaseIE(InfoExtractor):
|
|||||||
'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
|
'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
|
||||||
'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
|
'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
|
||||||
'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
|
'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
|
||||||
'ShareClipRenderStatus': 'f130048a462a0ac86bb54d653c968c514e9ab9ca94db52368c1179e97b0f16eb',
|
|
||||||
'ChannelCollectionsContent': '447aec6a0cc1e8d0a8d7732d47eb0762c336a2294fdb009e9c9d854e49d484b9',
|
'ChannelCollectionsContent': '447aec6a0cc1e8d0a8d7732d47eb0762c336a2294fdb009e9c9d854e49d484b9',
|
||||||
'StreamMetadata': 'a647c2a13599e5991e175155f798ca7f1ecddde73f7f341f39009c14dbf59962',
|
'StreamMetadata': 'a647c2a13599e5991e175155f798ca7f1ecddde73f7f341f39009c14dbf59962',
|
||||||
'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
|
'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
|
||||||
|
'VideoAccessToken_Clip': '36b89d2507fce29e5ca551df756d27c1cfe079e2609642b4390aa4c35796eb11',
|
||||||
'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
|
'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
|
||||||
'VideoMetadata': '49b5b8f268cdeb259d75b58dcb0c1a748e3b575003448a2333dc5cdafd49adad',
|
'VideoMetadata': '49b5b8f268cdeb259d75b58dcb0c1a748e3b575003448a2333dc5cdafd49adad',
|
||||||
'VideoPlayer_ChapterSelectButtonVideo': '8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41',
|
'VideoPlayer_ChapterSelectButtonVideo': '8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41',
|
||||||
@ -1084,44 +1083,16 @@ class TwitchClipsIE(TwitchBaseIE):
|
|||||||
'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat',
|
'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat',
|
||||||
'md5': '761769e1eafce0ffebfb4089cb3847cd',
|
'md5': '761769e1eafce0ffebfb4089cb3847cd',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '396245304',
|
'id': '42850523',
|
||||||
'display_id': 'FaintLightGullWholeWheat',
|
'display_id': 'FaintLightGullWholeWheat',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': 'EA Play 2016 Live from the Novo Theatre',
|
'title': 'EA Play 2016 Live from the Novo Theatre',
|
||||||
'duration': 32,
|
|
||||||
'view_count': int,
|
|
||||||
'thumbnail': r're:^https?://.*\.jpg',
|
'thumbnail': r're:^https?://.*\.jpg',
|
||||||
'timestamp': 1465767393,
|
'timestamp': 1465767393,
|
||||||
'upload_date': '20160612',
|
'upload_date': '20160612',
|
||||||
'creators': ['EA'],
|
'creator': 'EA',
|
||||||
'channel': 'EA',
|
'uploader': 'stereotype_',
|
||||||
'channel_id': '25163635',
|
'uploader_id': '43566419',
|
||||||
'channel_is_verified': False,
|
|
||||||
'channel_follower_count': int,
|
|
||||||
'uploader': 'EA',
|
|
||||||
'uploader_id': '25163635',
|
|
||||||
},
|
|
||||||
}, {
|
|
||||||
'url': 'https://www.twitch.tv/xqc/clip/CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
|
|
||||||
'md5': 'e90fe616b36e722a8cfa562547c543f0',
|
|
||||||
'info_dict': {
|
|
||||||
'id': '3207364882',
|
|
||||||
'display_id': 'CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
|
|
||||||
'ext': 'mp4',
|
|
||||||
'title': 'A day in the life of xQc',
|
|
||||||
'duration': 60,
|
|
||||||
'view_count': int,
|
|
||||||
'thumbnail': r're:^https?://.*\.jpg',
|
|
||||||
'timestamp': 1742869615,
|
|
||||||
'upload_date': '20250325',
|
|
||||||
'creators': ['xQc'],
|
|
||||||
'channel': 'xQc',
|
|
||||||
'channel_id': '71092938',
|
|
||||||
'channel_is_verified': True,
|
|
||||||
'channel_follower_count': int,
|
|
||||||
'uploader': 'xQc',
|
|
||||||
'uploader_id': '71092938',
|
|
||||||
'categories': ['Just Chatting'],
|
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# multiple formats
|
# multiple formats
|
||||||
@ -1145,14 +1116,16 @@ class TwitchClipsIE(TwitchBaseIE):
|
|||||||
}]
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
slug = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
|
|
||||||
clip = self._download_gql(
|
clip = self._download_gql(
|
||||||
slug, [{
|
video_id, [{
|
||||||
'operationName': 'ShareClipRenderStatus',
|
'operationName': 'VideoAccessToken_Clip',
|
||||||
'variables': {'slug': slug},
|
'variables': {
|
||||||
|
'slug': video_id,
|
||||||
|
},
|
||||||
}],
|
}],
|
||||||
'Downloading clip GraphQL')[0]['data']['clip']
|
'Downloading clip access token GraphQL')[0]['data']['clip']
|
||||||
|
|
||||||
if not clip:
|
if not clip:
|
||||||
raise ExtractorError(
|
raise ExtractorError(
|
||||||
@ -1162,71 +1135,81 @@ class TwitchClipsIE(TwitchBaseIE):
|
|||||||
'sig': clip['playbackAccessToken']['signature'],
|
'sig': clip['playbackAccessToken']['signature'],
|
||||||
'token': clip['playbackAccessToken']['value'],
|
'token': clip['playbackAccessToken']['value'],
|
||||||
}
|
}
|
||||||
asset_default = traverse_obj(clip, ('assets', 0, {dict})) or {}
|
|
||||||
asset_portrait = traverse_obj(clip, ('assets', 1, {dict})) or {}
|
data = self._download_base_gql(
|
||||||
|
video_id, {
|
||||||
|
'query': '''{
|
||||||
|
clip(slug: "%s") {
|
||||||
|
broadcaster {
|
||||||
|
displayName
|
||||||
|
}
|
||||||
|
createdAt
|
||||||
|
curator {
|
||||||
|
displayName
|
||||||
|
id
|
||||||
|
}
|
||||||
|
durationSeconds
|
||||||
|
id
|
||||||
|
tiny: thumbnailURL(width: 86, height: 45)
|
||||||
|
small: thumbnailURL(width: 260, height: 147)
|
||||||
|
medium: thumbnailURL(width: 480, height: 272)
|
||||||
|
title
|
||||||
|
videoQualities {
|
||||||
|
frameRate
|
||||||
|
quality
|
||||||
|
sourceURL
|
||||||
|
}
|
||||||
|
viewCount
|
||||||
|
}
|
||||||
|
}''' % video_id}, 'Downloading clip GraphQL', fatal=False) # noqa: UP031
|
||||||
|
|
||||||
|
if data:
|
||||||
|
clip = try_get(data, lambda x: x['data']['clip'], dict) or clip
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
default_aspect_ratio = float_or_none(asset_default.get('aspectRatio'))
|
for option in clip.get('videoQualities', []):
|
||||||
formats.extend(traverse_obj(asset_default, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']), {
|
if not isinstance(option, dict):
|
||||||
'url': ('sourceURL', {update_url_query(query=access_query)}),
|
continue
|
||||||
'format_id': ('quality', {str}),
|
source = url_or_none(option.get('sourceURL'))
|
||||||
'height': ('quality', {int_or_none}),
|
if not source:
|
||||||
'fps': ('frameRate', {float_or_none}),
|
continue
|
||||||
'aspect_ratio': {value(default_aspect_ratio)},
|
|
||||||
})))
|
|
||||||
portrait_aspect_ratio = float_or_none(asset_portrait.get('aspectRatio'))
|
|
||||||
for source in traverse_obj(asset_portrait, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']))):
|
|
||||||
formats.append({
|
formats.append({
|
||||||
'url': update_url_query(source['sourceURL'], access_query),
|
'url': update_url_query(source, access_query),
|
||||||
'format_id': join_nonempty('portrait', source.get('quality')),
|
'format_id': option.get('quality'),
|
||||||
'height': int_or_none(source.get('quality')),
|
'height': int_or_none(option.get('quality')),
|
||||||
'fps': float_or_none(source.get('frameRate')),
|
'fps': int_or_none(option.get('frameRate')),
|
||||||
'aspect_ratio': portrait_aspect_ratio,
|
|
||||||
'quality': -2,
|
|
||||||
})
|
})
|
||||||
|
|
||||||
thumbnails = []
|
thumbnails = []
|
||||||
thumb_asset_default_url = url_or_none(asset_default.get('thumbnailURL'))
|
for thumbnail_id in ('tiny', 'small', 'medium'):
|
||||||
if thumb_asset_default_url:
|
thumbnail_url = clip.get(thumbnail_id)
|
||||||
thumbnails.append({
|
if not thumbnail_url:
|
||||||
'id': 'default',
|
continue
|
||||||
'url': thumb_asset_default_url,
|
thumb = {
|
||||||
'preference': 0,
|
'id': thumbnail_id,
|
||||||
})
|
'url': thumbnail_url,
|
||||||
if thumb_asset_portrait_url := url_or_none(asset_portrait.get('thumbnailURL')):
|
}
|
||||||
thumbnails.append({
|
mobj = re.search(r'-(\d+)x(\d+)\.', thumbnail_url)
|
||||||
'id': 'portrait',
|
if mobj:
|
||||||
'url': thumb_asset_portrait_url,
|
thumb.update({
|
||||||
'preference': -1,
|
'height': int(mobj.group(2)),
|
||||||
})
|
'width': int(mobj.group(1)),
|
||||||
thumb_default_url = url_or_none(clip.get('thumbnailURL'))
|
})
|
||||||
if thumb_default_url and thumb_default_url != thumb_asset_default_url:
|
thumbnails.append(thumb)
|
||||||
thumbnails.append({
|
|
||||||
'id': 'small',
|
|
||||||
'url': thumb_default_url,
|
|
||||||
'preference': -2,
|
|
||||||
})
|
|
||||||
|
|
||||||
old_id = self._search_regex(r'%7C(\d+)(?:-\d+)?.mp4', formats[-1]['url'], 'old id', default=None)
|
old_id = self._search_regex(r'%7C(\d+)(?:-\d+)?.mp4', formats[-1]['url'], 'old id', default=None)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': clip.get('id') or slug,
|
'id': clip.get('id') or video_id,
|
||||||
'_old_archive_ids': [make_archive_id(self, old_id)] if old_id else None,
|
'_old_archive_ids': [make_archive_id(self, old_id)] if old_id else None,
|
||||||
'display_id': slug,
|
'display_id': video_id,
|
||||||
|
'title': clip.get('title'),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
|
'duration': int_or_none(clip.get('durationSeconds')),
|
||||||
|
'view_count': int_or_none(clip.get('viewCount')),
|
||||||
|
'timestamp': unified_timestamp(clip.get('createdAt')),
|
||||||
'thumbnails': thumbnails,
|
'thumbnails': thumbnails,
|
||||||
**traverse_obj(clip, {
|
'creator': try_get(clip, lambda x: x['broadcaster']['displayName'], str),
|
||||||
'title': ('title', {str}),
|
'uploader': try_get(clip, lambda x: x['curator']['displayName'], str),
|
||||||
'duration': ('durationSeconds', {int_or_none}),
|
'uploader_id': try_get(clip, lambda x: x['curator']['id'], str),
|
||||||
'view_count': ('viewCount', {int_or_none}),
|
|
||||||
'timestamp': ('createdAt', {parse_iso8601}),
|
|
||||||
'creators': ('broadcaster', 'displayName', {str}, filter, all),
|
|
||||||
'channel': ('broadcaster', 'displayName', {str}),
|
|
||||||
'channel_id': ('broadcaster', 'id', {str}),
|
|
||||||
'channel_follower_count': ('broadcaster', 'followers', 'totalCount', {int_or_none}),
|
|
||||||
'channel_is_verified': ('broadcaster', 'isPartner', {bool}),
|
|
||||||
'uploader': ('broadcaster', 'displayName', {str}),
|
|
||||||
'uploader_id': ('broadcaster', 'id', {str}),
|
|
||||||
'categories': ('game', 'displayName', {str}, filter, all, filter),
|
|
||||||
}),
|
|
||||||
}
|
}
|
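Note on the clip thumbnail handling in the hunk above: one side of the diff derives thumbnail dimensions from the URL with the pattern `-(\d+)x(\d+)\.`. The following is a minimal standalone sketch of that idea using only the standard library. The function name `thumbnail_with_size` and the example URLs are hypothetical and not part of the extractor.

```python
import re


def thumbnail_with_size(thumbnail_id, thumbnail_url):
    """Build a thumbnail dict; add width/height when the URL contains -<W>x<H>."""
    thumb = {'id': thumbnail_id, 'url': thumbnail_url}
    # Same pattern as in the diff: a dash, width, 'x', height, then a literal dot
    mobj = re.search(r'-(\d+)x(\d+)\.', thumbnail_url)
    if mobj:
        thumb.update({
            'width': int(mobj.group(1)),
            'height': int(mobj.group(2)),
        })
    return thumb


if __name__ == '__main__':
    # Hypothetical URLs, for illustration only
    print(thumbnail_with_size('small', 'https://example.invalid/clip-preview-260x147.jpg'))
    print(thumbnail_with_size('tiny', 'https://example.invalid/clip.jpg'))
```

URLs without a size suffix simply yield a dict without `width` and `height`, which mirrors the `if mobj:` guard in the diff.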
||||||
|
@ -544,7 +544,7 @@ class VKIE(VKBaseIE):
|
|||||||
'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
|
'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
|
||||||
'duration': ('duration', {int_or_none}),
|
'duration': ('duration', {int_or_none}),
|
||||||
'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
|
'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
|
||||||
'title': ('text', {unescapeHTML}),
|
'title': ('text', {str}),
|
||||||
'start_time': 'time',
|
'start_time': 'time',
|
||||||
}),
|
}),
|
||||||
}),
|
}),
|
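Note on the chapter title change in the hunk above (`{str}` versus `{unescapeHTML}`): the difference matters when chapter text arrives with HTML entities. The sketch below illustrates the effect with the standard library's `html.unescape`, which is used here as a stand-in and is an assumption; it is not claimed to behave identically to yt-dlp's `unescapeHTML`. The function name `build_chapters` and the sample data are hypothetical.

```python
import html


def build_chapters(time_codes):
    """Keep entries whose 'time' is an int and unescape HTML entities in the title."""
    chapters = []
    for entry in time_codes or []:
        # Mirrors the filter in the diff: isinstance(v['time'], int)
        if not isinstance(entry, dict) or not isinstance(entry.get('time'), int):
            continue
        text = entry.get('text')
        chapters.append({
            # html.unescape is a stand-in for the extractor's unescapeHTML helper
            'title': html.unescape(text) if isinstance(text, str) else None,
            'start_time': entry['time'],
        })
    return chapters


if __name__ == '__main__':
    sample = [
        {'time': 0, 'text': 'Intro &amp; setup'},
        {'time': '12', 'text': 'skipped: time is not an int'},
        {'time': 30, 'text': 'Q&amp;A'},
    ]
    print(build_chapters(sample))
```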
||||||
|
@ -2,17 +2,15 @@ import itertools
|
|||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
bug_reports_message,
|
|
||||||
determine_ext,
|
determine_ext,
|
||||||
|
extract_attributes,
|
||||||
int_or_none,
|
int_or_none,
|
||||||
lowercase_escape,
|
lowercase_escape,
|
||||||
parse_qs,
|
parse_qs,
|
||||||
qualities,
|
traverse_obj,
|
||||||
try_get,
|
try_get,
|
||||||
update_url_query,
|
|
||||||
url_or_none,
|
url_or_none,
|
||||||
)
|
)
|
||||||
from ..utils.traversal import traverse_obj
|
|
||||||
|
|
||||||
|
|
||||||
class YandexVideoIE(InfoExtractor):
|
class YandexVideoIE(InfoExtractor):
|
||||||
@ -188,22 +186,7 @@ class YandexVideoPreviewIE(InfoExtractor):
|
|||||||
return self.url_result(data_json['video']['url'])
|
return self.url_result(data_json['video']['url'])
|
||||||
|
|
||||||
|
|
||||||
class ZenYandexBaseIE(InfoExtractor):
|
class ZenYandexIE(InfoExtractor):
|
||||||
def _fetch_ssr_data(self, url, video_id):
|
|
||||||
webpage = self._download_webpage(url, video_id)
|
|
||||||
redirect = self._search_json(
|
|
||||||
r'(?:var|let|const)\s+it\s*=', webpage, 'redirect', video_id, default={}).get('retpath')
|
|
||||||
if redirect:
|
|
||||||
video_id = self._match_id(redirect)
|
|
||||||
webpage = self._download_webpage(redirect, video_id, note='Redirecting')
|
|
||||||
return video_id, self._search_json(
|
|
||||||
r'(?:var|let|const)\s+_params\s*=\s*\(', webpage, 'metadata', video_id,
|
|
||||||
contains_pattern=r'{["\']ssrData.+}')['ssrData']
|
|
||||||
|
|
||||||
|
|
||||||
class ZenYandexIE(ZenYandexBaseIE):
|
|
||||||
IE_NAME = 'dzen.ru'
|
|
||||||
IE_DESC = 'Дзен (dzen) formerly Яндекс.Дзен (Yandex Zen)'
|
|
||||||
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru(?:/video)?/(media|watch)/(?:(?:id/[^/]+/|[^/]+/)(?:[a-z0-9-]+)-)?(?P<id>[a-z0-9-]+)'
|
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru(?:/video)?/(media|watch)/(?:(?:id/[^/]+/|[^/]+/)(?:[a-z0-9-]+)-)?(?P<id>[a-z0-9-]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://zen.yandex.ru/media/id/606fd806cc13cb3c58c05cf5/vot-eto-focus-dedy-morozy-na-gidrociklah-60c7c443da18892ebfe85ed7',
|
'url': 'https://zen.yandex.ru/media/id/606fd806cc13cb3c58c05cf5/vot-eto-focus-dedy-morozy-na-gidrociklah-60c7c443da18892ebfe85ed7',
|
||||||
@ -233,7 +216,6 @@ class ZenYandexIE(ZenYandexBaseIE):
|
|||||||
'timestamp': 1573465585,
|
'timestamp': 1573465585,
|
||||||
},
|
},
|
||||||
'params': {'skip_download': 'm3u8'},
|
'params': {'skip_download': 'm3u8'},
|
||||||
'skip': 'The page does not exist',
|
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://zen.yandex.ru/video/watch/6002240ff8b1af50bb2da5e3',
|
'url': 'https://zen.yandex.ru/video/watch/6002240ff8b1af50bb2da5e3',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -245,9 +227,6 @@ class ZenYandexIE(ZenYandexBaseIE):
|
|||||||
'uploader': 'TechInsider',
|
'uploader': 'TechInsider',
|
||||||
'timestamp': 1611378221,
|
'timestamp': 1611378221,
|
||||||
'upload_date': '20210123',
|
'upload_date': '20210123',
|
||||||
'view_count': int,
|
|
||||||
'duration': 243,
|
|
||||||
'tags': ['опыт', 'эксперимент', 'огонь'],
|
|
||||||
},
|
},
|
||||||
'params': {'skip_download': 'm3u8'},
|
'params': {'skip_download': 'm3u8'},
|
||||||
}, {
|
}, {
|
||||||
@ -261,9 +240,6 @@ class ZenYandexIE(ZenYandexBaseIE):
|
|||||||
'uploader': 'TechInsider',
|
'uploader': 'TechInsider',
|
||||||
'upload_date': '20210123',
|
'upload_date': '20210123',
|
||||||
'timestamp': 1611378221,
|
'timestamp': 1611378221,
|
||||||
'view_count': int,
|
|
||||||
'duration': 243,
|
|
||||||
'tags': ['опыт', 'эксперимент', 'огонь'],
|
|
||||||
},
|
},
|
||||||
'params': {'skip_download': 'm3u8'},
|
'params': {'skip_download': 'm3u8'},
|
||||||
}, {
|
}, {
|
||||||
@ -276,56 +252,44 @@ class ZenYandexIE(ZenYandexBaseIE):
|
|||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
video_id, ssr_data = self._fetch_ssr_data(url, video_id)
|
webpage = self._download_webpage(url, video_id)
|
||||||
video_data = ssr_data['videoMetaResponse']
|
redirect = self._search_json(r'var it\s*=', webpage, 'redirect', id, default={}).get('retpath')
|
||||||
|
if redirect:
|
||||||
|
video_id = self._match_id(redirect)
|
||||||
|
webpage = self._download_webpage(redirect, video_id, note='Redirecting')
|
||||||
|
data_json = self._search_json(
|
||||||
|
r'("data"\s*:|data\s*=)', webpage, 'metadata', video_id, contains_pattern=r'{["\']_*serverState_*video.+}')
|
||||||
|
serverstate = self._search_regex(r'(_+serverState_+video-site_[^_]+_+)', webpage, 'server state')
|
||||||
|
uploader = self._search_regex(r'(<a\s*class=["\']card-channel-link[^"\']+["\'][^>]+>)',
|
||||||
|
webpage, 'uploader', default='<a>')
|
||||||
|
uploader_name = extract_attributes(uploader).get('aria-label')
|
||||||
|
item_id = traverse_obj(data_json, (serverstate, 'videoViewer', 'openedItemId', {str}))
|
||||||
|
video_json = traverse_obj(data_json, (serverstate, 'videoViewer', 'items', item_id, {dict})) or {}
|
||||||
|
|
||||||
formats, subtitles = [], {}
|
formats, subtitles = [], {}
|
||||||
quality = qualities(('4', '0', '1', '2', '3', '5', '6', '7'))
|
for s_url in traverse_obj(video_json, ('video', 'streams', ..., {url_or_none})):
|
||||||
# Deduplicate stream URLs. The "dzen_dash" query parameter is present in some URLs but can be omitted
|
|
||||||
stream_urls = set(traverse_obj(video_data, (
|
|
||||||
'video', ('id', ('streams', ...), ('mp4Streams', ..., 'url'), ('oneVideoStreams', ..., 'url')),
|
|
||||||
{url_or_none}, {update_url_query(query={'dzen_dash': []})})))
|
|
||||||
for s_url in stream_urls:
|
|
||||||
ext = determine_ext(s_url)
|
ext = determine_ext(s_url)
|
||||||
content_type = traverse_obj(parse_qs(s_url), ('ct', 0))
|
if ext == 'mpd':
|
||||||
if ext == 'mpd' or content_type == '6':
|
fmts, subs = self._extract_mpd_formats_and_subtitles(s_url, video_id, mpd_id='dash')
|
||||||
fmts, subs = self._extract_mpd_formats_and_subtitles(s_url, video_id, mpd_id='dash', fatal=False)
|
elif ext == 'm3u8':
|
||||||
elif ext == 'm3u8' or content_type == '8':
|
fmts, subs = self._extract_m3u8_formats_and_subtitles(s_url, video_id, 'mp4')
|
||||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(s_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
|
|
||||||
elif content_type == '0':
|
|
||||||
format_type = traverse_obj(parse_qs(s_url), ('type', 0))
|
|
||||||
formats.append({
|
|
||||||
'url': s_url,
|
|
||||||
'format_id': format_type,
|
|
||||||
'ext': 'mp4',
|
|
||||||
'quality': quality(format_type),
|
|
||||||
})
|
|
||||||
continue
|
|
||||||
else:
|
|
||||||
self.report_warning(f'Unsupported stream URL: {s_url}{bug_reports_message()}')
|
|
||||||
continue
|
|
||||||
formats.extend(fmts)
|
formats.extend(fmts)
|
||||||
self._merge_subtitles(subs, target=subtitles)
|
subtitles = self._merge_subtitles(subtitles, subs)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
|
'title': video_json.get('title') or self._og_search_title(webpage),
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
'subtitles': subtitles,
|
'subtitles': subtitles,
|
||||||
**traverse_obj(video_data, {
|
'duration': int_or_none(video_json.get('duration')),
|
||||||
'title': ('title', {str}),
|
'view_count': int_or_none(video_json.get('views')),
|
||||||
'description': ('description', {str}),
|
'timestamp': int_or_none(video_json.get('publicationDate')),
|
||||||
'thumbnail': ('image', {url_or_none}),
|
'uploader': uploader_name or data_json.get('authorName') or try_get(data_json, lambda x: x['publisher']['name']),
|
||||||
'duration': ('video', 'duration', {int_or_none}),
|
'description': video_json.get('description') or self._og_search_description(webpage),
|
||||||
'view_count': ('video', 'views', {int_or_none}),
|
'thumbnail': self._og_search_thumbnail(webpage) or try_get(data_json, lambda x: x['og']['imageUrl']),
|
||||||
'timestamp': ('publicationDate', {int_or_none}),
|
|
||||||
'tags': ('tags', ..., {str}),
|
|
||||||
'uploader': ('source', 'title', {str}),
|
|
||||||
}),
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class ZenYandexChannelIE(ZenYandexBaseIE):
|
class ZenYandexChannelIE(InfoExtractor):
|
||||||
IE_NAME = 'dzen.ru:channel'
|
|
||||||
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru/(?!media|video)(?:id/)?(?P<id>[a-z0-9-_]+)'
|
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru/(?!media|video)(?:id/)?(?P<id>[a-z0-9-_]+)'
|
||||||
_TESTS = [{
|
_TESTS = [{
|
||||||
'url': 'https://zen.yandex.ru/tok_media',
|
'url': 'https://zen.yandex.ru/tok_media',
|
||||||
@ -359,8 +323,8 @@ class ZenYandexChannelIE(ZenYandexBaseIE):
|
|||||||
'url': 'https://zen.yandex.ru/jony_me',
|
'url': 'https://zen.yandex.ru/jony_me',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'jony_me',
|
'id': 'jony_me',
|
||||||
'description': 'md5:7c30d11dc005faba8826feae99da3113',
|
'description': 'md5:ce0a5cad2752ab58701b5497835b2cc5',
|
||||||
'title': 'JONY',
|
'title': 'JONY ',
|
||||||
},
|
},
|
||||||
'playlist_count': 18,
|
'playlist_count': 18,
|
||||||
}, {
|
}, {
|
||||||
@ -369,8 +333,9 @@ class ZenYandexChannelIE(ZenYandexBaseIE):
|
|||||||
'url': 'https://zen.yandex.ru/tatyanareva',
|
'url': 'https://zen.yandex.ru/tatyanareva',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'tatyanareva',
|
'id': 'tatyanareva',
|
||||||
'description': 'md5:92e56fa730a932ca2483ba5c2186ad96',
|
'description': 'md5:40a1e51f174369ec3ba9d657734ac31f',
|
||||||
'title': 'Татьяна Рева',
|
'title': 'Татьяна Рева',
|
||||||
|
'entries': 'maxcount:200',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 46,
|
'playlist_mincount': 46,
|
||||||
}, {
|
}, {
|
||||||
@ -383,31 +348,43 @@ class ZenYandexChannelIE(ZenYandexBaseIE):
|
|||||||
'playlist_mincount': 657,
|
'playlist_mincount': 657,
|
||||||
}]
|
}]
|
||||||
|
|
||||||
def _entries(self, feed_data, channel_id):
|
def _entries(self, item_id, server_state_json, server_settings_json):
|
||||||
|
items = (traverse_obj(server_state_json, ('feed', 'items', ...))
|
||||||
|
or traverse_obj(server_settings_json, ('exportData', 'items', ...)))
|
||||||
|
|
||||||
|
more = (traverse_obj(server_state_json, ('links', 'more'))
|
||||||
|
or traverse_obj(server_settings_json, ('exportData', 'more', 'link')))
|
||||||
|
|
||||||
next_page_id = None
|
next_page_id = None
|
||||||
for page in itertools.count(1):
|
for page in itertools.count(1):
|
||||||
for item in traverse_obj(feed_data, (
|
for item in items or []:
|
||||||
(None, ('items', lambda _, v: v['tab'] in ('shorts', 'longs'))),
|
if item.get('type') != 'gif':
|
||||||
'items', lambda _, v: url_or_none(v['link']),
|
continue
|
||||||
)):
|
video_id = traverse_obj(item, 'publication_id', 'publicationId') or ''
|
||||||
yield self.url_result(item['link'], ZenYandexIE, item.get('id'), title=item.get('title'))
|
yield self.url_result(item['link'], ZenYandexIE, video_id.split(':')[-1])
|
||||||
|
|
||||||
more = traverse_obj(feed_data, ('more', 'link', {url_or_none}))
|
|
||||||
current_page_id = next_page_id
|
current_page_id = next_page_id
|
||||||
next_page_id = traverse_obj(parse_qs(more), ('next_page_id', -1))
|
next_page_id = traverse_obj(parse_qs(more), ('next_page_id', -1))
|
||||||
if not all((more, next_page_id, next_page_id != current_page_id)):
|
if not all((more, items, next_page_id, next_page_id != current_page_id)):
|
||||||
break
|
break
|
||||||
|
|
||||||
feed_data = self._download_json(more, channel_id, note=f'Downloading Page {page}')
|
data = self._download_json(more, item_id, note=f'Downloading Page {page}')
|
||||||
|
items, more = data.get('items'), traverse_obj(data, ('more', 'link'))
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
channel_id = self._match_id(url)
|
item_id = self._match_id(url)
|
||||||
channel_id, ssr_data = self._fetch_ssr_data(url, channel_id)
|
webpage = self._download_webpage(url, item_id)
|
||||||
channel_data = ssr_data['exportResponse']
|
redirect = self._search_json(
|
||||||
|
r'var it\s*=', webpage, 'redirect', item_id, default={}).get('retpath')
|
||||||
|
if redirect:
|
||||||
|
item_id = self._match_id(redirect)
|
||||||
|
webpage = self._download_webpage(redirect, item_id, note='Redirecting')
|
||||||
|
data = self._search_json(
|
||||||
|
r'("data"\s*:|data\s*=)', webpage, 'channel data', item_id, contains_pattern=r'{\"__serverState__.+}')
|
||||||
|
server_state_json = traverse_obj(data, lambda k, _: k.startswith('__serverState__'), get_all=False)
|
||||||
|
server_settings_json = traverse_obj(data, lambda k, _: k.startswith('__serverSettings__'), get_all=False)
|
||||||
|
|
||||||
return self.playlist_result(
|
return self.playlist_result(
|
||||||
self._entries(channel_data['feedData'], channel_id),
|
self._entries(item_id, server_state_json, server_settings_json),
|
||||||
channel_id, **traverse_obj(channel_data, ('channel', 'source', {
|
item_id, traverse_obj(server_state_json, ('channel', 'source', 'title')),
|
||||||
'title': ('title', {str}),
|
traverse_obj(server_state_json, ('channel', 'source', 'description')))
|
||||||
'description': ('description', {str}),
|
|
||||||
})))
|
|
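Note on the stream handling in the Zen/Dzen hunks above: one side of the diff deduplicates stream URLs by dropping the `dzen_dash` query parameter, then picks a handler from the file extension or the `ct` query value (`6` for DASH, `8` for HLS, `0` for a direct MP4). The sketch below reproduces that decision logic with `urllib.parse` only. The helper names (`drop_query_param`, `classify_stream`, `dedupe_streams`) and the example URLs are hypothetical, and the extension check is a simplified stand-in for yt-dlp's `determine_ext`.

```python
import posixpath
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit


def drop_query_param(url, name):
    """Return url with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def dedupe_streams(urls):
    """Treat URLs that differ only by `dzen_dash` as the same stream."""
    return sorted({drop_query_param(u, 'dzen_dash') for u in urls})


def classify_stream(url):
    """Map a stream URL to 'dash', 'hls', 'mp4', or None when unsupported."""
    parts = urlsplit(url)
    # Simplified extension detection (assumption; not yt-dlp's determine_ext)
    ext = posixpath.splitext(parts.path)[1].lstrip('.').lower()
    content_type = (parse_qs(parts.query).get('ct') or [None])[0]
    if ext == 'mpd' or content_type == '6':
        return 'dash'
    if ext == 'm3u8' or content_type == '8':
        return 'hls'
    if content_type == '0':
        return 'mp4'
    return None


if __name__ == '__main__':
    streams = [
        'https://example.invalid/v/master.m3u8?dzen_dash=1&a=b',
        'https://example.invalid/v/master.m3u8?a=b',
        'https://example.invalid/v/?ct=0&type=2',
    ]
    for stream_url in dedupe_streams(streams):
        print(classify_stream(stream_url), stream_url)
```

Returning `None` for unknown streams leaves the caller free to warn and continue, as the diff does with `report_warning`.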
||||||
|
@ -803,14 +803,12 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
|
|||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def _extract_continuation_ep_data(cls, continuation_ep: dict):
|
def _extract_continuation_ep_data(cls, continuation_ep: dict):
|
||||||
continuation_commands = traverse_obj(
|
if isinstance(continuation_ep, dict):
|
||||||
continuation_ep, ('commandExecutorCommand', 'commands', ..., {dict}))
|
continuation = try_get(
|
||||||
continuation_commands.append(continuation_ep)
|
continuation_ep, lambda x: x['continuationCommand']['token'], str)
|
||||||
for command in continuation_commands:
|
|
||||||
continuation = traverse_obj(command, ('continuationCommand', 'token', {str}))
|
|
||||||
if not continuation:
|
if not continuation:
|
||||||
continue
|
return
|
||||||
ctp = command.get('clickTrackingParams')
|
ctp = continuation_ep.get('clickTrackingParams')
|
||||||
return cls._build_api_continuation_query(continuation, ctp)
|
return cls._build_api_continuation_query(continuation, ctp)
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
|
@ -524,16 +524,10 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
|
|||||||
response = self._extract_response(
|
response = self._extract_response(
|
||||||
item_id=f'{item_id} page {page_num}',
|
item_id=f'{item_id} page {page_num}',
|
||||||
query=continuation, headers=headers, ytcfg=ytcfg,
|
query=continuation, headers=headers, ytcfg=ytcfg,
|
||||||
check_get_keys=(
|
check_get_keys=('continuationContents', 'onResponseReceivedActions', 'onResponseReceivedEndpoints'))
|
||||||
'continuationContents', 'onResponseReceivedActions', 'onResponseReceivedEndpoints',
|
|
||||||
# Playlist recommendations may return with no data - ignore
|
|
||||||
('responseContext', 'serviceTrackingParams', ..., 'params', ..., lambda k, v: k == 'key' and v == 'GetRecommendedMusicPlaylists_rid'),
|
|
||||||
))
|
|
||||||
|
|
||||||
if not response:
|
if not response:
|
||||||
break
|
break
|
||||||
|
|
||||||
continuation = None
|
|
||||||
# Extracting updated visitor data is required to prevent an infinite extraction loop in some cases
|
# Extracting updated visitor data is required to prevent an infinite extraction loop in some cases
|
||||||
# See: https://github.com/ytdl-org/youtube-dl/issues/28702
|
# See: https://github.com/ytdl-org/youtube-dl/issues/28702
|
||||||
visitor_data = self._extract_visitor_data(response) or visitor_data
|
visitor_data = self._extract_visitor_data(response) or visitor_data
|
||||||
@ -570,13 +564,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
|
|||||||
yield from func(video_items_renderer)
|
yield from func(video_items_renderer)
|
||||||
continuation = continuation_list[0] or self._extract_continuation(video_items_renderer)
|
continuation = continuation_list[0] or self._extract_continuation(video_items_renderer)
|
||||||
|
|
||||||
# In the case only a continuation is returned, try to follow it.
|
if not video_items_renderer:
|
||||||
# We extract this after trying to extract non-continuation items as otherwise this
|
|
||||||
# may be prioritized over other continuations.
|
|
||||||
# see: https://github.com/yt-dlp/yt-dlp/issues/12933
|
|
||||||
continuation = continuation or self._extract_continuation({'contents': [continuation_item]})
|
|
||||||
|
|
||||||
if not continuation and not video_items_renderer:
|
|
||||||
break
|
break
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
@ -1011,14 +999,14 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_mincount': 94,
|
'playlist_mincount': 94,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'title': 'Igor Kleiner - Playlists',
|
'title': 'Igor Kleiner Ph.D. - Playlists',
|
||||||
'description': 'md5:15d7dd9e333cb987907fcb0d604b233a',
|
'description': 'md5:15d7dd9e333cb987907fcb0d604b233a',
|
||||||
'uploader': 'Igor Kleiner ',
|
'uploader': 'Igor Kleiner Ph.D.',
|
||||||
'uploader_id': '@IgorDataScience',
|
'uploader_id': '@IgorDataScience',
|
||||||
'uploader_url': 'https://www.youtube.com/@IgorDataScience',
|
'uploader_url': 'https://www.youtube.com/@IgorDataScience',
|
||||||
'channel': 'Igor Kleiner ',
|
'channel': 'Igor Kleiner Ph.D.',
|
||||||
'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'tags': 'count:23',
|
'tags': ['критическое мышление', 'наука просто', 'математика', 'анализ данных'],
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg',
|
'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'channel_follower_count': int,
|
'channel_follower_count': int,
|
||||||
},
|
},
|
||||||
@ -1028,19 +1016,18 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_mincount': 94,
|
'playlist_mincount': 94,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
'id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'title': 'Igor Kleiner - Playlists',
|
'title': 'Igor Kleiner Ph.D. - Playlists',
|
||||||
'description': 'md5:15d7dd9e333cb987907fcb0d604b233a',
|
'description': 'md5:15d7dd9e333cb987907fcb0d604b233a',
|
||||||
'uploader': 'Igor Kleiner ',
|
'uploader': 'Igor Kleiner Ph.D.',
|
||||||
'uploader_id': '@IgorDataScience',
|
'uploader_id': '@IgorDataScience',
|
||||||
'uploader_url': 'https://www.youtube.com/@IgorDataScience',
|
'uploader_url': 'https://www.youtube.com/@IgorDataScience',
|
||||||
'tags': 'count:23',
|
'tags': ['критическое мышление', 'наука просто', 'математика', 'анализ данных'],
|
||||||
'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'channel': 'Igor Kleiner ',
|
'channel': 'Igor Kleiner Ph.D.',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg',
|
'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg',
|
||||||
'channel_follower_count': int,
|
'channel_follower_count': int,
|
||||||
},
|
},
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix channel_is_verified extraction
|
|
||||||
'note': 'playlists, series',
|
'note': 'playlists, series',
|
||||||
'url': 'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3',
|
'url': 'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3',
|
||||||
'playlist_mincount': 5,
|
'playlist_mincount': 5,
|
||||||
@ -1079,23 +1066,22 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
|
'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix availability extraction
|
|
||||||
'note': 'basic, single video playlist',
|
'note': 'basic, single video playlist',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlSLRHmI1qNm0wjyVNWw1pCU',
|
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'PLt5yu3-wZAlSLRHmI1qNm0wjyVNWw1pCU',
|
'id': 'PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
|
||||||
'title': 'single video playlist',
|
'title': 'youtube-dl public playlist',
|
||||||
'description': '',
|
'description': '',
|
||||||
'tags': [],
|
'tags': [],
|
||||||
'view_count': int,
|
'view_count': int,
|
||||||
'modified_date': '20250417',
|
'modified_date': '20201130',
|
||||||
'channel': 'cole-dlp-test-acc',
|
'channel': 'Sergey M.',
|
||||||
'channel_id': 'UCiu-3thuViMebBjw_5nWYrA',
|
'channel_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA',
|
'channel_url': 'https://www.youtube.com/channel/UCmlqkdCBesrv2Lak1mF_MxA',
|
||||||
'availability': 'public',
|
'availability': 'public',
|
||||||
'uploader': 'cole-dlp-test-acc',
|
'uploader': 'Sergey M.',
|
||||||
'uploader_url': 'https://www.youtube.com/@coletdjnz',
|
'uploader_url': 'https://www.youtube.com/@sergeym.6173',
|
||||||
'uploader_id': '@coletdjnz',
|
'uploader_id': '@sergeym.6173',
|
||||||
},
|
},
|
||||||
'playlist_count': 1,
|
'playlist_count': 1,
|
||||||
}, {
|
}, {
|
||||||
@ -1185,11 +1171,11 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
'playlist_mincount': 17,
|
'playlist_mincount': 17,
|
||||||
}, {
|
}, {
|
||||||
'note': 'Posts tab',
|
'note': 'Community tab',
|
||||||
'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w/community',
|
'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w/community',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
'title': 'lex will - Posts',
|
'title': 'lex will - Community',
|
||||||
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
'channel': 'lex will',
|
'channel': 'lex will',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w',
|
'channel_url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
@ -1202,14 +1188,30 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
'playlist_mincount': 18,
|
'playlist_mincount': 18,
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix channel_is_verified extraction
|
'note': 'Channels tab',
|
||||||
|
'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w/channels',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
|
'title': 'lex will - Channels',
|
||||||
|
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
|
||||||
|
'channel': 'lex will',
|
||||||
|
'channel_url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
|
'channel_id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
|
||||||
|
'tags': ['bible', 'history', 'prophesy'],
|
||||||
|
'channel_follower_count': int,
|
||||||
|
'uploader_url': 'https://www.youtube.com/@lexwill718',
|
||||||
|
'uploader_id': '@lexwill718',
|
||||||
|
'uploader': 'lex will',
|
||||||
|
},
|
||||||
|
'playlist_mincount': 12,
|
||||||
|
}, {
|
||||||
'note': 'Search tab',
|
'note': 'Search tab',
|
||||||
'url': 'https://www.youtube.com/c/3blue1brown/search?query=linear%20algebra',
|
'url': 'https://www.youtube.com/c/3blue1brown/search?query=linear%20algebra',
|
||||||
'playlist_mincount': 40,
|
'playlist_mincount': 40,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCYO_jab_esuFRV4b17AJtAw',
|
'id': 'UCYO_jab_esuFRV4b17AJtAw',
|
||||||
'title': '3Blue1Brown - Search - linear algebra',
|
'title': '3Blue1Brown - Search - linear algebra',
|
||||||
'description': 'md5:602e3789e6a0cb7d9d352186b720e395',
|
'description': 'md5:4d1da95432004b7ba840ebc895b6b4c9',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
|
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
|
||||||
'tags': ['Mathematics'],
|
'tags': ['Mathematics'],
|
||||||
'channel': '3Blue1Brown',
|
'channel': '3Blue1Brown',
|
||||||
@ -1230,7 +1232,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'url': 'https://music.youtube.com/channel/UCmlqkdCBesrv2Lak1mF_MxA',
|
'url': 'https://music.youtube.com/channel/UCmlqkdCBesrv2Lak1mF_MxA',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix availability extraction
|
|
||||||
'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.',
|
'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
|
'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1293,25 +1294,24 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
'playlist_mincount': 21,
|
'playlist_mincount': 21,
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix availability extraction
|
|
||||||
'note': 'Playlist with "show unavailable videos" button',
|
'note': 'Playlist with "show unavailable videos" button',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLYwq8WOe86_xGmR7FrcJq8Sb7VW8K3Tt2',
|
'url': 'https://www.youtube.com/playlist?list=UUTYLiWFZy8xtPwxFwX9rV7Q',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'title': 'The Memes Of 2010s.....',
|
'title': 'Uploads from Phim Siêu Nhân Nhật Bản',
|
||||||
'id': 'PLYwq8WOe86_xGmR7FrcJq8Sb7VW8K3Tt2',
|
'id': 'UUTYLiWFZy8xtPwxFwX9rV7Q',
|
||||||
'view_count': int,
|
'view_count': int,
|
||||||
'channel': "I'm Not JiNxEd",
|
'channel': 'Phim Siêu Nhân Nhật Bản',
|
||||||
'tags': [],
|
'tags': [],
|
||||||
'description': 'md5:44dc3b315ba69394feaafa2f40e7b2a1',
|
'description': '',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UC5H5H85D1QE5-fuWWQ1hdNg',
|
'channel_url': 'https://www.youtube.com/channel/UCTYLiWFZy8xtPwxFwX9rV7Q',
|
||||||
'channel_id': 'UC5H5H85D1QE5-fuWWQ1hdNg',
|
'channel_id': 'UCTYLiWFZy8xtPwxFwX9rV7Q',
|
||||||
'modified_date': r're:\d{8}',
|
'modified_date': r're:\d{8}',
|
||||||
'availability': 'public',
|
'availability': 'public',
|
||||||
'uploader_url': 'https://www.youtube.com/@imnotjinxed1998',
|
'uploader_url': 'https://www.youtube.com/@phimsieunhannhatban',
|
||||||
'uploader_id': '@imnotjinxed1998',
|
'uploader_id': '@phimsieunhannhatban',
|
||||||
'uploader': "I'm Not JiNxEd",
|
'uploader': 'Phim Siêu Nhân Nhật Bản',
|
||||||
},
|
},
|
||||||
'playlist_mincount': 150,
|
'playlist_mincount': 200,
|
||||||
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
|
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
|
||||||
}, {
|
}, {
|
||||||
'note': 'Playlist with unavailable videos in page 7',
|
'note': 'Playlist with unavailable videos in page 7',
|
||||||
@ -1334,7 +1334,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_mincount': 1000,
|
'playlist_mincount': 1000,
|
||||||
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
|
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix availability extraction
|
|
||||||
'note': 'https://github.com/ytdl-org/youtube-dl/issues/21844',
|
'note': 'https://github.com/ytdl-org/youtube-dl/issues/21844',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba',
|
'url': 'https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1385,7 +1384,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
'url': 'https://www.youtube.com/channel/UCoMdktPbSTixAyNGwb-UYkQ/live',
|
'url': 'https://www.youtube.com/channel/UCoMdktPbSTixAyNGwb-UYkQ/live',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'YDvsBbKfLPA', # This will keep changing
|
'id': 'hGkQjiJLjWQ', # This will keep changing
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'title': str,
|
'title': str,
|
||||||
'upload_date': r're:\d{8}',
|
'upload_date': r're:\d{8}',
|
||||||
@ -1410,8 +1409,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'uploader_id': '@SkyNews',
|
'uploader_id': '@SkyNews',
|
||||||
'uploader': 'Sky News',
|
'uploader': 'Sky News',
|
||||||
'channel_is_verified': True,
|
'channel_is_verified': True,
|
||||||
'media_type': 'livestream',
|
|
||||||
'timestamp': int,
|
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
@ -1499,7 +1496,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'url': 'https://music.youtube.com/browse/UC1a8OFewdjuLq6KlF8M_8Ng',
|
'url': 'https://music.youtube.com/browse/UC1a8OFewdjuLq6KlF8M_8Ng',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix availability extraction
|
|
||||||
'note': 'VLPL, should redirect to playlist?list=PL...',
|
'note': 'VLPL, should redirect to playlist?list=PL...',
|
||||||
'url': 'https://music.youtube.com/browse/VLPLRBp0Fe2GpgmgoscNFLxNyBVSFVdYmFkq',
|
'url': 'https://music.youtube.com/browse/VLPLRBp0Fe2GpgmgoscNFLxNyBVSFVdYmFkq',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1541,7 +1537,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
# Destination channel with only a hidden self tab (tab id is UCtFRv9O2AHqOZjjynzrv-xg)
|
# Destination channel with only a hidden self tab (tab id is UCtFRv9O2AHqOZjjynzrv-xg)
|
||||||
# Treat as a general feed
|
# Treat as a general feed
|
||||||
# TODO: fix extraction
|
|
||||||
'url': 'https://www.youtube.com/channel/UCtFRv9O2AHqOZjjynzrv-xg',
|
'url': 'https://www.youtube.com/channel/UCtFRv9O2AHqOZjjynzrv-xg',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCtFRv9O2AHqOZjjynzrv-xg',
|
'id': 'UCtFRv9O2AHqOZjjynzrv-xg',
|
||||||
@ -1565,21 +1560,21 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'expected_warnings': ['YouTube Music is not directly supported'],
|
'expected_warnings': ['YouTube Music is not directly supported'],
|
||||||
}, {
|
}, {
|
||||||
'note': 'unlisted single video playlist',
|
'note': 'unlisted single video playlist',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlQLfIN0MMgp0wVV6MP3bM4_',
|
'url': 'https://www.youtube.com/playlist?list=PLwL24UFy54GrB3s2KMMfjZscDi1x5Dajf',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'PLt5yu3-wZAlQLfIN0MMgp0wVV6MP3bM4_',
|
'id': 'PLwL24UFy54GrB3s2KMMfjZscDi1x5Dajf',
|
||||||
'title': 'unlisted playlist',
|
'title': 'yt-dlp unlisted playlist test',
|
||||||
'availability': 'unlisted',
|
'availability': 'unlisted',
|
||||||
'tags': [],
|
'tags': [],
|
||||||
'modified_date': '20250417',
|
'modified_date': '20220418',
|
||||||
'channel': 'cole-dlp-test-acc',
|
'channel': 'colethedj',
|
||||||
'view_count': int,
|
'view_count': int,
|
||||||
'description': '',
|
'description': '',
|
||||||
'channel_id': 'UCiu-3thuViMebBjw_5nWYrA',
|
'channel_id': 'UC9zHu_mHU96r19o-wV5Qs1Q',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA',
|
'channel_url': 'https://www.youtube.com/channel/UC9zHu_mHU96r19o-wV5Qs1Q',
|
||||||
'uploader_url': 'https://www.youtube.com/@coletdjnz',
|
'uploader_url': 'https://www.youtube.com/@colethedj1894',
|
||||||
'uploader_id': '@coletdjnz',
|
'uploader_id': '@colethedj1894',
|
||||||
'uploader': 'cole-dlp-test-acc',
|
'uploader': 'colethedj',
|
||||||
},
|
},
|
||||||
'playlist': [{
|
'playlist': [{
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1601,7 +1596,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_count': 1,
|
'playlist_count': 1,
|
||||||
'params': {'extract_flat': True},
|
'params': {'extract_flat': True},
|
||||||
}, {
|
}, {
|
||||||
# By default, recommended is always empty.
|
|
||||||
'note': 'API Fallback: Recommended - redirects to home page. Requires visitorData',
|
'note': 'API Fallback: Recommended - redirects to home page. Requires visitorData',
|
||||||
'url': 'https://www.youtube.com/feed/recommended',
|
'url': 'https://www.youtube.com/feed/recommended',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1609,7 +1603,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'title': 'recommended',
|
'title': 'recommended',
|
||||||
'tags': [],
|
'tags': [],
|
||||||
},
|
},
|
||||||
'playlist_count': 0,
|
'playlist_mincount': 50,
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
'extractor_args': {'youtubetab': {'skip': ['webpage']}},
|
'extractor_args': {'youtubetab': {'skip': ['webpage']}},
|
||||||
@ -1634,7 +1628,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
'skip': 'Query for sorting no longer works',
|
'skip': 'Query for sorting no longer works',
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix 'unviewable' issue with this playlist when reloading with unavailable videos
|
|
||||||
'note': 'API Fallback: Topic, should redirect to playlist?list=UU...',
|
'note': 'API Fallback: Topic, should redirect to playlist?list=UU...',
|
||||||
'url': 'https://music.youtube.com/browse/UC9ALqqC4aIeG5iDs7i90Bfw',
|
'url': 'https://music.youtube.com/browse/UC9ALqqC4aIeG5iDs7i90Bfw',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1661,12 +1654,11 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'url': 'https://www.youtube.com/channel/UCwVVpHQ2Cs9iGJfpdFngePQ',
|
'url': 'https://www.youtube.com/channel/UCwVVpHQ2Cs9iGJfpdFngePQ',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix metadata extraction
|
|
||||||
'note': 'collaborative playlist (uploader name in the form "by <uploader> and x other(s)")',
|
'note': 'collaborative playlist (uploader name in the form "by <uploader> and x other(s)")',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6',
|
'url': 'https://www.youtube.com/playlist?list=PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6',
|
'id': 'PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6',
|
||||||
'modified_date': '20250115',
|
'modified_date': '20220407',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCKcqXmCcyqnhgpA5P0oHH_Q',
|
'channel_url': 'https://www.youtube.com/channel/UCKcqXmCcyqnhgpA5P0oHH_Q',
|
||||||
'tags': [],
|
'tags': [],
|
||||||
'availability': 'unlisted',
|
'availability': 'unlisted',
|
||||||
@ -1700,7 +1692,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'expected_warnings': ['Preferring "ja"'],
|
'expected_warnings': ['Preferring "ja"'],
|
||||||
}, {
|
}, {
|
||||||
# XXX: this should really check flat playlist entries, but the test suite doesn't support that
|
# XXX: this should really check flat playlist entries, but the test suite doesn't support that
|
||||||
# TODO: fix availability extraction
|
|
||||||
'note': 'preferred lang set with playlist with translated video titles',
|
'note': 'preferred lang set with playlist with translated video titles',
|
||||||
'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlQAaPZ5Z-rJoTdbT-45Q7c0',
|
'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlQAaPZ5Z-rJoTdbT-45Q7c0',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
@ -1723,7 +1714,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
# shorts audio pivot for 2GtVksBMYFM.
|
# shorts audio pivot for 2GtVksBMYFM.
|
||||||
'url': 'https://www.youtube.com/feed/sfv_audio_pivot?bp=8gUrCikSJwoLMkd0VmtzQk1ZRk0SCzJHdFZrc0JNWUZNGgsyR3RWa3NCTVlGTQ==',
|
'url': 'https://www.youtube.com/feed/sfv_audio_pivot?bp=8gUrCikSJwoLMkd0VmtzQk1ZRk0SCzJHdFZrc0JNWUZNGgsyR3RWa3NCTVlGTQ==',
|
||||||
# TODO: fix extraction
|
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'sfv_audio_pivot',
|
'id': 'sfv_audio_pivot',
|
||||||
'title': 'sfv_audio_pivot',
|
'title': 'sfv_audio_pivot',
|
||||||
@ -1761,7 +1751,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_mincount': 8,
|
'playlist_mincount': 8,
|
||||||
}, {
|
}, {
|
||||||
# Should get three playlists for videos, shorts and streams tabs
|
# Should get three playlists for videos, shorts and streams tabs
|
||||||
# TODO: fix channel_is_verified extraction
|
|
||||||
'url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA',
|
'url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCK9V2B22uJYu3N7eR_BT9QA',
|
'id': 'UCK9V2B22uJYu3N7eR_BT9QA',
|
||||||
@ -1769,7 +1758,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'channel_follower_count': int,
|
'channel_follower_count': int,
|
||||||
'channel_id': 'UCK9V2B22uJYu3N7eR_BT9QA',
|
'channel_id': 'UCK9V2B22uJYu3N7eR_BT9QA',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA',
|
'channel_url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA',
|
||||||
'description': 'md5:01e53f350ab8ad6fcf7c4fedb3c1b99f',
|
'description': 'md5:49809d8bf9da539bc48ed5d1f83c33f2',
|
||||||
'channel': 'Polka Ch. 尾丸ポルカ',
|
'channel': 'Polka Ch. 尾丸ポルカ',
|
||||||
'tags': 'count:35',
|
'tags': 'count:35',
|
||||||
'uploader_url': 'https://www.youtube.com/@OmaruPolka',
|
'uploader_url': 'https://www.youtube.com/@OmaruPolka',
|
||||||
@ -1780,14 +1769,14 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_count': 3,
|
'playlist_count': 3,
|
||||||
}, {
|
}, {
|
||||||
# Shorts tab with channel with handle
|
# Shorts tab with channel with handle
|
||||||
# TODO: fix channel_is_verified extraction
|
# TODO: fix channel description
|
||||||
'url': 'https://www.youtube.com/@NotJustBikes/shorts',
|
'url': 'https://www.youtube.com/@NotJustBikes/shorts',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UC0intLFzLaudFG-xAvUEO-A',
|
'id': 'UC0intLFzLaudFG-xAvUEO-A',
|
||||||
'title': 'Not Just Bikes - Shorts',
|
'title': 'Not Just Bikes - Shorts',
|
||||||
'tags': 'count:10',
|
'tags': 'count:10',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UC0intLFzLaudFG-xAvUEO-A',
|
'channel_url': 'https://www.youtube.com/channel/UC0intLFzLaudFG-xAvUEO-A',
|
||||||
'description': 'md5:1d9fc1bad7f13a487299d1fe1712e031',
|
'description': 'md5:5e82545b3a041345927a92d0585df247',
|
||||||
'channel_follower_count': int,
|
'channel_follower_count': int,
|
||||||
'channel_id': 'UC0intLFzLaudFG-xAvUEO-A',
|
'channel_id': 'UC0intLFzLaudFG-xAvUEO-A',
|
||||||
'channel': 'Not Just Bikes',
|
'channel': 'Not Just Bikes',
|
||||||
@ -1808,7 +1797,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'channel_url': 'https://www.youtube.com/channel/UC3eYAvjCVwNHgkaGbXX3sig',
|
'channel_url': 'https://www.youtube.com/channel/UC3eYAvjCVwNHgkaGbXX3sig',
|
||||||
'channel': '中村悠一',
|
'channel': '中村悠一',
|
||||||
'channel_follower_count': int,
|
'channel_follower_count': int,
|
||||||
'description': 'md5:e8fd705073a594f27d6d6d020da560dc',
|
'description': 'md5:e744f6c93dafa7a03c0c6deecb157300',
|
||||||
'uploader_url': 'https://www.youtube.com/@Yuichi-Nakamura',
|
'uploader_url': 'https://www.youtube.com/@Yuichi-Nakamura',
|
||||||
'uploader_id': '@Yuichi-Nakamura',
|
'uploader_id': '@Yuichi-Nakamura',
|
||||||
'uploader': '中村悠一',
|
'uploader': '中村悠一',
|
||||||
@ -1826,7 +1815,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}, {
|
}, {
|
||||||
# No videos tab but has a shorts tab
|
# No videos tab but has a shorts tab
|
||||||
# TODO: fix metadata extraction
|
|
||||||
'url': 'https://www.youtube.com/c/TKFShorts',
|
'url': 'https://www.youtube.com/c/TKFShorts',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCgJ5_1F6yJhYLnyMszUdmUg',
|
'id': 'UCgJ5_1F6yJhYLnyMszUdmUg',
|
||||||
@ -1863,7 +1851,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
}, {
|
}, {
|
||||||
# Shorts url result in shorts tab
|
# Shorts url result in shorts tab
|
||||||
# TODO: Fix channel id extraction
|
# TODO: Fix channel id extraction
|
||||||
# TODO: fix test suite, 208163447408c78673b08c172beafe5c310fb167 broke this test
|
|
||||||
'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/shorts',
|
'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/shorts',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCiu-3thuViMebBjw_5nWYrA',
|
'id': 'UCiu-3thuViMebBjw_5nWYrA',
|
||||||
@ -1892,7 +1879,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'params': {'extract_flat': True},
|
'params': {'extract_flat': True},
|
||||||
}, {
|
}, {
|
||||||
# Live video status should be extracted
|
# Live video status should be extracted
|
||||||
# TODO: fix test suite, 208163447408c78673b08c172beafe5c310fb167 broke this test
|
|
||||||
'url': 'https://www.youtube.com/channel/UCQvWX73GQygcwXOTSf_VDVg/live',
|
'url': 'https://www.youtube.com/channel/UCQvWX73GQygcwXOTSf_VDVg/live',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCQvWX73GQygcwXOTSf_VDVg',
|
'id': 'UCQvWX73GQygcwXOTSf_VDVg',
|
||||||
@ -1921,7 +1907,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_mincount': 1,
|
'playlist_mincount': 1,
|
||||||
}, {
|
}, {
|
||||||
# Channel renderer metadata. Contains number of videos on the channel
|
# Channel renderer metadata. Contains number of videos on the channel
|
||||||
# TODO: channels tab removed, change this test to use another page with channel renderer
|
|
||||||
'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/channels',
|
'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/channels',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCiu-3thuViMebBjw_5nWYrA',
|
'id': 'UCiu-3thuViMebBjw_5nWYrA',
|
||||||
@ -1955,9 +1940,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
}],
|
}],
|
||||||
'params': {'extract_flat': True},
|
'params': {'extract_flat': True},
|
||||||
'skip': 'channels tab removed',
|
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix channel_is_verified extraction
|
|
||||||
'url': 'https://www.youtube.com/@3blue1brown/about',
|
'url': 'https://www.youtube.com/@3blue1brown/about',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '@3blue1brown',
|
'id': '@3blue1brown',
|
||||||
@ -1967,7 +1950,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'channel_id': 'UCYO_jab_esuFRV4b17AJtAw',
|
'channel_id': 'UCYO_jab_esuFRV4b17AJtAw',
|
||||||
'channel': '3Blue1Brown',
|
'channel': '3Blue1Brown',
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
|
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
|
||||||
'description': 'md5:602e3789e6a0cb7d9d352186b720e395',
|
'description': 'md5:4d1da95432004b7ba840ebc895b6b4c9',
|
||||||
'uploader_url': 'https://www.youtube.com/@3blue1brown',
|
'uploader_url': 'https://www.youtube.com/@3blue1brown',
|
||||||
'uploader_id': '@3blue1brown',
|
'uploader_id': '@3blue1brown',
|
||||||
'uploader': '3Blue1Brown',
|
'uploader': '3Blue1Brown',
|
||||||
@ -1993,7 +1976,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_count': 5,
|
'playlist_count': 5,
|
||||||
}, {
|
}, {
|
||||||
# Releases tab, with rich entry playlistRenderers (same as Podcasts tab)
|
# Releases tab, with rich entry playlistRenderers (same as Podcasts tab)
|
||||||
# TODO: fix channel_is_verified extraction
|
|
||||||
'url': 'https://www.youtube.com/@AHimitsu/releases',
|
'url': 'https://www.youtube.com/@AHimitsu/releases',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'UCgFwu-j5-xNJml2FtTrrB3A',
|
'id': 'UCgFwu-j5-xNJml2FtTrrB3A',
|
||||||
@ -2033,7 +2015,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'playlist_mincount': 100,
|
'playlist_mincount': 100,
|
||||||
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
|
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
|
||||||
}, {
|
}, {
|
||||||
# TODO: fix channel_is_verified extraction
|
|
||||||
'note': 'Tags containing spaces',
|
'note': 'Tags containing spaces',
|
||||||
'url': 'https://www.youtube.com/channel/UC7_YxT-KID8kRbqZo7MyscQ',
|
'url': 'https://www.youtube.com/channel/UC7_YxT-KID8kRbqZo7MyscQ',
|
||||||
'playlist_count': 3,
|
'playlist_count': 3,
|
||||||
@ -2054,24 +2035,6 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
|
|||||||
'challenges', 'sketches', 'scary games', 'funny games', 'rage games',
|
'challenges', 'sketches', 'scary games', 'funny games', 'rage games',
|
||||||
'mark fischbach'],
|
'mark fischbach'],
|
||||||
},
|
},
|
||||||
}, {
|
|
||||||
# https://github.com/yt-dlp/yt-dlp/issues/12933
|
|
||||||
'note': 'streams tab, some scheduled streams. Empty intermediate response with only continuation - must follow',
|
|
||||||
'url': 'https://www.youtube.com/@sbcitygov/streams',
|
|
||||||
'playlist_mincount': 150,
|
|
||||||
'info_dict': {
|
|
||||||
'id': 'UCH6-qfQwlUgz9SAf05jvc_w',
|
|
||||||
'channel': 'sbcitygov',
|
|
||||||
'channel_id': 'UCH6-qfQwlUgz9SAf05jvc_w',
|
|
||||||
'title': 'sbcitygov - Live',
|
|
||||||
'channel_follower_count': int,
|
|
||||||
'description': 'md5:ca1a92059835c071e33b3db52f4a6d67',
|
|
||||||
'uploader_id': '@sbcitygov',
|
|
||||||
'uploader_url': 'https://www.youtube.com/@sbcitygov',
|
|
||||||
'uploader': 'sbcitygov',
|
|
||||||
'channel_url': 'https://www.youtube.com/channel/UCH6-qfQwlUgz9SAf05jvc_w',
|
|
||||||
'tags': [],
|
|
||||||
},
|
|
||||||
}]
|
}]
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
|
@ -34,7 +34,6 @@ from ...utils import (
|
|||||||
clean_html,
|
clean_html,
|
||||||
datetime_from_str,
|
datetime_from_str,
|
||||||
filesize_from_tbr,
|
filesize_from_tbr,
|
||||||
filter_dict,
|
|
||||||
float_or_none,
|
float_or_none,
|
||||||
format_field,
|
format_field,
|
||||||
get_first,
|
get_first,
|
||||||
@ -1761,16 +1760,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
},
|
},
|
||||||
]
|
]
|
||||||
|
|
||||||
_PLAYER_JS_VARIANT_MAP = {
|
|
||||||
'main': 'player_ias.vflset/en_US/base.js',
|
|
||||||
'tce': 'player_ias_tce.vflset/en_US/base.js',
|
|
||||||
'tv': 'tv-player-ias.vflset/tv-player-ias.js',
|
|
||||||
'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
|
|
||||||
'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
|
|
||||||
'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
|
|
||||||
}
|
|
||||||
_INVERSE_PLAYER_JS_VARIANT_MAP = {v: k for k, v in _PLAYER_JS_VARIANT_MAP.items()}
|
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def suitable(cls, url):
|
def suitable(cls, url):
|
||||||
from yt_dlp.utils import parse_qs
|
from yt_dlp.utils import parse_qs
|
||||||
@ -1950,21 +1939,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
get_all=False, expected_type=str)
|
get_all=False, expected_type=str)
|
||||||
if not player_url:
|
if not player_url:
|
||||||
return
|
return
|
||||||
|
|
||||||
requested_js_variant = self._configuration_arg('player_js_variant', [''])[0] or 'actual'
|
|
||||||
if requested_js_variant in self._PLAYER_JS_VARIANT_MAP:
|
|
||||||
player_id = self._extract_player_info(player_url)
|
|
||||||
original_url = player_url
|
|
||||||
player_url = f'/s/player/{player_id}/{self._PLAYER_JS_VARIANT_MAP[requested_js_variant]}'
|
|
||||||
if original_url != player_url:
|
|
||||||
self.write_debug(
|
|
||||||
f'Forcing "{requested_js_variant}" player JS variant for player {player_id}\n'
|
|
||||||
f' original url = {original_url}', only_once=True)
|
|
||||||
elif requested_js_variant != 'actual':
|
|
||||||
self.report_warning(
|
|
||||||
f'Invalid player JS variant name "{requested_js_variant}" requested. '
|
|
||||||
f'Valid choices are: {", ".join(self._PLAYER_JS_VARIANT_MAP)}', only_once=True)
|
|
||||||
|
|
||||||
return urljoin('https://www.youtube.com', player_url)
|
return urljoin('https://www.youtube.com', player_url)
|
||||||
|
|
||||||
def _download_player_url(self, video_id, fatal=False):
|
def _download_player_url(self, video_id, fatal=False):
|
||||||
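The `player_js_variant` handling removed in this hunk (present on the master side only) can be sketched in isolation as below. This is a minimal illustration, not the extractor's code: `force_player_variant` is a hypothetical helper name, the player id `abc12345` used afterwards is made up, and the warning and debug logging from the diff are omitted. The variant table is copied from `_PLAYER_JS_VARIANT_MAP` as shown above.

```python
from urllib.parse import urljoin

# Copied from _PLAYER_JS_VARIANT_MAP in the diff above.
PLAYER_JS_VARIANT_MAP = {
    'main': 'player_ias.vflset/en_US/base.js',
    'tce': 'player_ias_tce.vflset/en_US/base.js',
    'tv': 'tv-player-ias.vflset/tv-player-ias.js',
    'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
    'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
    'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
}


def force_player_variant(player_id, original_path, requested='actual'):
    """Hypothetical helper: rewrite the player path when a known variant is requested.

    Unknown variant names (and the default 'actual') leave the path untouched,
    which mirrors the fall-through in the diff; the warning is left out here.
    """
    if requested in PLAYER_JS_VARIANT_MAP:
        path = f'/s/player/{player_id}/{PLAYER_JS_VARIANT_MAP[requested]}'
    else:
        path = original_path
    # Relative player paths are resolved against the YouTube origin, as in the diff.
    return urljoin('https://www.youtube.com', path)
```

Calling `force_player_variant('abc12345', '/s/player/abc12345/player_ias.vflset/en_US/base.js', 'tv')` under these assumptions yields a URL that keeps the player id and swaps only the file path for the `tv` variant.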
@ -1979,17 +1953,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if player_version:
|
if player_version:
|
||||||
return f'https://www.youtube.com/s/player/{player_version}/player_ias.vflset/en_US/base.js'
|
return f'https://www.youtube.com/s/player/{player_version}/player_ias.vflset/en_US/base.js'
|
||||||
|
|
||||||
def _player_js_cache_key(self, player_url):
|
|
||||||
player_id = self._extract_player_info(player_url)
|
|
||||||
player_path = remove_start(urllib.parse.urlparse(player_url).path, f'/s/player/{player_id}/')
|
|
||||||
variant = self._INVERSE_PLAYER_JS_VARIANT_MAP.get(player_path)
|
|
||||||
if not variant:
|
|
||||||
self.write_debug(
|
|
||||||
f'Unable to determine player JS variant\n'
|
|
||||||
f' player = {player_url}', only_once=True)
|
|
||||||
variant = re.sub(r'[^a-zA-Z0-9]', '_', remove_end(player_path, '.js'))
|
|
||||||
return join_nonempty(player_id, variant)
|
|
||||||
|
|
||||||
def _signature_cache_id(self, example_sig):
|
def _signature_cache_id(self, example_sig):
|
||||||
""" Return a string representation of a signature """
|
""" Return a string representation of a signature """
|
||||||
return '.'.join(str(len(part)) for part in example_sig.split('.'))
|
return '.'.join(str(len(part)) for part in example_sig.split('.'))
|
||||||
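The `_player_js_cache_key` method removed in this hunk derives a cache key from the player id plus the JS variant, so that two variants of the same player do not share a cache entry. A standard-library sketch of that idea follows. It assumes the player id is already known (the diff obtains it via `_extract_player_info`), replaces yt-dlp's `remove_start`, `remove_end` and `join_nonempty` helpers with plain string operations, and includes only two entries of the inverse map for brevity. `player_js_cache_key` as a free function and the id `abc12345` are illustrative assumptions.

```python
import re
from urllib.parse import urlparse

# Two entries of the inverse variant map, for illustration only.
INVERSE_VARIANT_MAP = {
    'player_ias.vflset/en_US/base.js': 'main',
    'tv-player-ias.vflset/tv-player-ias.js': 'tv',
}


def player_js_cache_key(player_url, player_id):
    """Hypothetical stand-in for _player_js_cache_key, without logging."""
    prefix = f'/s/player/{player_id}/'
    path = urlparse(player_url).path
    if path.startswith(prefix):
        path = path[len(prefix):]
    variant = INVERSE_VARIANT_MAP.get(path)
    if not variant:
        # Unknown variant: sanitize the path (minus '.js') into a filename-safe token.
        stem = path[:-len('.js')] if path.endswith('.js') else path
        variant = re.sub(r'[^a-zA-Z0-9]', '_', stem)
    # Join the non-empty parts with '-'.
    return '-'.join(part for part in (player_id, variant) if part)
```

Keying on the variant as well as the id is what lets the master side cache, say, the `tv` and `main` player code for one player release side by side.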
@ -2005,29 +1968,30 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
return id_m.group('id')
|
return id_m.group('id')
|
||||||
|
|
||||||
def _load_player(self, video_id, player_url, fatal=True):
|
def _load_player(self, video_id, player_url, fatal=True):
|
||||||
player_js_key = self._player_js_cache_key(player_url)
|
player_id = self._extract_player_info(player_url)
|
||||||
if player_js_key not in self._code_cache:
|
if player_id not in self._code_cache:
|
||||||
code = self._download_webpage(
|
code = self._download_webpage(
|
||||||
player_url, video_id, fatal=fatal,
|
player_url, video_id, fatal=fatal,
|
||||||
note=f'Downloading player {player_js_key}',
|
note='Downloading player ' + player_id,
|
||||||
errnote=f'Download of {player_js_key} failed')
|
errnote=f'Download of {player_url} failed')
|
||||||
if code:
|
if code:
|
||||||
self._code_cache[player_js_key] = code
|
self._code_cache[player_id] = code
|
||||||
return self._code_cache.get(player_js_key)
|
return self._code_cache.get(player_id)
|
||||||
|
|
||||||
def _extract_signature_function(self, video_id, player_url, example_sig):
|
def _extract_signature_function(self, video_id, player_url, example_sig):
|
||||||
|
player_id = self._extract_player_info(player_url)
|
||||||
|
|
||||||
# Read from filesystem cache
|
# Read from filesystem cache
|
||||||
func_id = join_nonempty(
|
func_id = f'js_{player_id}_{self._signature_cache_id(example_sig)}'
|
||||||
self._player_js_cache_key(player_url), self._signature_cache_id(example_sig))
|
|
||||||
assert os.path.basename(func_id) == func_id
|
assert os.path.basename(func_id) == func_id
|
||||||
|
|
||||||
self.write_debug(f'Extracting signature function {func_id}')
|
self.write_debug(f'Extracting signature function {func_id}')
|
||||||
cache_spec, code = self.cache.load('youtube-sigfuncs', func_id, min_ver='2025.03.31'), None
|
cache_spec, code = self.cache.load('youtube-sigfuncs', func_id), None
|
||||||
|
|
||||||
if not cache_spec:
|
if not cache_spec:
|
||||||
code = self._load_player(video_id, player_url)
|
code = self._load_player(video_id, player_url)
|
||||||
if code:
|
if code:
|
||||||
res = self._parse_sig_js(code, player_url)
|
res = self._parse_sig_js(code)
|
||||||
test_string = ''.join(map(chr, range(len(example_sig))))
|
test_string = ''.join(map(chr, range(len(example_sig))))
|
||||||
cache_spec = [ord(c) for c in res(test_string)]
|
cache_spec = [ord(c) for c in res(test_string)]
|
||||||
self.cache.store('youtube-sigfuncs', func_id, cache_spec)
|
self.cache.store('youtube-sigfuncs', func_id, cache_spec)
|
||||||
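The caching step in `_extract_signature_function` above relies on the signature function only rearranging or dropping characters: running it on a probe string whose characters are `chr(0)`, `chr(1)`, ... records which input index ends up at each output position. The sketch below illustrates this with a toy function. `toy_sig_func` and `apply_cache_spec` are hypothetical names; how the stored spec is replayed is not shown in this part of the diff, so the replay step here is an assumption.

```python
def signature_cache_id(example_sig):
    """Same expression as _signature_cache_id: the lengths of the dot-separated parts."""
    return '.'.join(str(len(part)) for part in example_sig.split('.'))


def build_cache_spec(sig_func, example_sig):
    """Probe sig_func with chr(0)..chr(n-1) and record the resulting index order."""
    test_string = ''.join(map(chr, range(len(example_sig))))
    return [ord(c) for c in sig_func(test_string)]


def apply_cache_spec(cache_spec, s):
    """Assumed replay step: pick characters of s in the recorded order."""
    return ''.join(s[i] for i in cache_spec)


def toy_sig_func(s):
    """Hypothetical stand-in for a real player function: reverse, then drop two chars."""
    return s[::-1][2:]
```

With an example signature of the same shape, `apply_cache_spec(build_cache_spec(toy_sig_func, sig), sig)` reproduces `toy_sig_func(sig)` without running the function again. This also shows why the cache key includes the part lengths: the spec is only valid for inputs of the same length.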
@ -2075,7 +2039,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
f' return {expr_code}\n')
|
f' return {expr_code}\n')
|
||||||
self.to_screen('Extracted signature function:\n' + code)
|
self.to_screen('Extracted signature function:\n' + code)
|
||||||
|
|
||||||
def _parse_sig_js(self, jscode, player_url):
|
def _parse_sig_js(self, jscode):
|
||||||
# Examples where `sig` is funcname:
|
# Examples where `sig` is funcname:
|
||||||
# sig=function(a){a=a.split(""); ... ;return a.join("")};
|
# sig=function(a){a=a.split(""); ... ;return a.join("")};
|
||||||
# ;c&&(c=sig(decodeURIComponent(c)),a.set(b,encodeURIComponent(c)));return a};
|
# ;c&&(c=sig(decodeURIComponent(c)),a.set(b,encodeURIComponent(c)));return a};
|
||||||
@ -2099,9 +2063,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('),
|
r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('),
|
||||||
jscode, 'Initial JS player signature function name', group='sig')
|
jscode, 'Initial JS player signature function name', group='sig')
|
||||||
|
|
||||||
varname, global_list = self._interpret_player_js_global_var(jscode, player_url)
|
|
||||||
jsi = JSInterpreter(jscode)
|
jsi = JSInterpreter(jscode)
|
||||||
initial_function = jsi.extract_function(funcname, filter_dict({varname: global_list}))
|
global_var_map = {}
|
||||||
|
_, varname, value = self._extract_player_js_global_var(jscode)
|
||||||
|
if varname:
|
||||||
|
global_var_map[varname] = jsi.interpret_expression(value, {}, allow_recursion=100)
|
||||||
|
initial_function = jsi.extract_function(funcname, global_var_map)
|
||||||
return lambda s: initial_function([s])
|
return lambda s: initial_function([s])
|
||||||
|
|
||||||
def _cached(self, func, *cache_id):
|
def _cached(self, func, *cache_id):
|
||||||
@ -2120,24 +2087,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
return ret
|
return ret
|
||||||
return inner
|
return inner
|
||||||
|
|
||||||
def _load_nsig_code_from_cache(self, player_url):
|
|
||||||
cache_id = ('youtube-nsig', self._player_js_cache_key(player_url))
|
|
||||||
|
|
||||||
if func_code := self._player_cache.get(cache_id):
|
|
||||||
return func_code
|
|
||||||
|
|
||||||
func_code = self.cache.load(*cache_id, min_ver='2025.03.31')
|
|
||||||
if func_code:
|
|
||||||
self._player_cache[cache_id] = func_code
|
|
||||||
|
|
||||||
return func_code
|
|
||||||
|
|
||||||
def _store_nsig_code_to_cache(self, player_url, func_code):
|
|
||||||
cache_id = ('youtube-nsig', self._player_js_cache_key(player_url))
|
|
||||||
if cache_id not in self._player_cache:
|
|
||||||
self.cache.store(*cache_id, func_code)
|
|
||||||
self._player_cache[cache_id] = func_code
|
|
||||||
|
|
||||||
def _decrypt_signature(self, s, video_id, player_url):
|
def _decrypt_signature(self, s, video_id, player_url):
|
||||||
"""Turn the encrypted s field into a working signature"""
|
"""Turn the encrypted s field into a working signature"""
|
||||||
extract_sig = self._cached(
|
extract_sig = self._cached(
|
||||||
@ -2178,31 +2127,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
video_id=video_id, note='Executing signature code').strip()
|
video_id=video_id, note='Executing signature code').strip()
|
||||||
|
|
||||||
self.write_debug(f'Decrypted nsig {s} => {ret}')
|
self.write_debug(f'Decrypted nsig {s} => {ret}')
|
||||||
# Only cache nsig func JS code to disk if successful, and only once
|
|
||||||
self._store_nsig_code_to_cache(player_url, func_code)
|
|
||||||
return ret
|
return ret
|
||||||
|
|
||||||
def _extract_n_function_name(self, jscode, player_url=None):
|
def _extract_n_function_name(self, jscode, player_url=None):
|
||||||
varname, global_list = self._interpret_player_js_global_var(jscode, player_url)
|
|
||||||
if debug_str := traverse_obj(global_list, (lambda _, v: v.endswith('_w8_'), any)):
|
|
||||||
funcname = self._search_regex(
|
|
||||||
r'''(?xs)
|
|
||||||
[;\n](?:
|
|
||||||
(?P<f>function\s+)|
|
|
||||||
(?:var\s+)?
|
|
||||||
)(?P<funcname>[a-zA-Z0-9_$]+)\s*(?(f)|=\s*function\s*)
|
|
||||||
\((?P<argname>[a-zA-Z0-9_$]+)\)\s*\{
|
|
||||||
(?:(?!\}[;\n]).)+
|
|
||||||
\}\s*catch\(\s*[a-zA-Z0-9_$]+\s*\)\s*
|
|
||||||
\{\s*return\s+%s\[%d\]\s*\+\s*(?P=argname)\s*\}\s*return\s+[^}]+\}[;\n]
|
|
||||||
''' % (re.escape(varname), global_list.index(debug_str)),
|
|
||||||
jscode, 'nsig function name', group='funcname', default=None)
|
|
||||||
if funcname:
|
|
||||||
return funcname
|
|
||||||
self.write_debug(join_nonempty(
|
|
||||||
'Initial search was unable to find nsig function name',
|
|
||||||
player_url and f' player = {player_url}', delim='\n'), only_once=True)
|
|
||||||
|
|
||||||
# Examples (with placeholders nfunc, narray, idx):
|
# Examples (with placeholders nfunc, narray, idx):
|
||||||
# * .get("n"))&&(b=nfunc(b)
|
# * .get("n"))&&(b=nfunc(b)
|
||||||
# * .get("n"))&&(b=narray[idx](b)
|
# * .get("n"))&&(b=narray[idx](b)
|
||||||
@ -2232,7 +2159,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if not funcname:
|
if not funcname:
|
||||||
self.report_warning(join_nonempty(
|
self.report_warning(join_nonempty(
|
||||||
'Falling back to generic n function search',
|
'Falling back to generic n function search',
|
||||||
player_url and f' player = {player_url}', delim='\n'), only_once=True)
|
player_url and f' player = {player_url}', delim='\n'))
|
||||||
return self._search_regex(
|
return self._search_regex(
|
||||||
r'''(?xs)
|
r'''(?xs)
|
||||||
;\s*(?P<name>[a-zA-Z0-9_$]+)\s*=\s*function\([a-zA-Z0-9_$]+\)
|
;\s*(?P<name>[a-zA-Z0-9_$]+)\s*=\s*function\([a-zA-Z0-9_$]+\)
|
||||||
@ -2245,10 +2172,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\])\s*[,;]', jscode,
|
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\])\s*[,;]', jscode,
|
||||||
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]
|
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]
|
||||||
|
|
||||||
def _extract_player_js_global_var(self, jscode, player_url):
|
def _extract_player_js_global_var(self, jscode):
|
||||||
"""Returns tuple of strings: variable assignment code, variable name, variable value code"""
|
"""Returns tuple of strings: variable assignment code, variable name, variable value code"""
|
||||||
extract_global_var = self._cached(self._search_regex, 'js global array', player_url)
|
return self._search_regex(
|
||||||
varcode, varname, varvalue = extract_global_var(
|
|
||||||
r'''(?x)
|
r'''(?x)
|
||||||
(?P<q1>["\'])use\s+strict(?P=q1);\s*
|
(?P<q1>["\'])use\s+strict(?P=q1);\s*
|
||||||
(?P<code>
|
(?P<code>
|
||||||
@ -2256,49 +2182,24 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
(?P<value>
|
(?P<value>
|
||||||
(?P<q2>["\'])(?:(?!(?P=q2)).|\\.)+(?P=q2)
|
(?P<q2>["\'])(?:(?!(?P=q2)).|\\.)+(?P=q2)
|
||||||
\.split\((?P<q3>["\'])(?:(?!(?P=q3)).)+(?P=q3)\)
|
\.split\((?P<q3>["\'])(?:(?!(?P=q3)).)+(?P=q3)\)
|
||||||
|\[\s*(?:(?P<q4>["\'])(?:(?!(?P=q4)).|\\.)*(?P=q4)\s*,?\s*)+\]
|
|
||||||
)
|
)
|
||||||
)[;,]
|
)[;,]
|
||||||
''', jscode, 'global variable', group=('code', 'name', 'value'), default=(None, None, None))
|
''', jscode, 'global variable', group=('code', 'name', 'value'), default=(None, None, None))
|
||||||
if not varcode:
|
|
||||||
self.write_debug(join_nonempty(
|
|
||||||
'No global array variable found in player JS',
|
|
||||||
player_url and f' player = {player_url}', delim='\n'), only_once=True)
|
|
||||||
return varcode, varname, varvalue
|
|
||||||
|
|
||||||
def _interpret_player_js_global_var(self, jscode, player_url):
|
def _fixup_n_function_code(self, argnames, code, full_code):
|
||||||
"""Returns tuple of: variable name string, variable value list"""
|
global_var, varname, _ = self._extract_player_js_global_var(full_code)
|
||||||
_, varname, array_code = self._extract_player_js_global_var(jscode, player_url)
|
if global_var:
|
||||||
jsi = JSInterpreter(array_code)
|
self.write_debug(f'Prepending n function code with global array variable "{varname}"')
|
||||||
interpret_global_var = self._cached(jsi.interpret_expression, 'js global list', player_url)
|
code = global_var + '; ' + code
|
||||||
return varname, interpret_global_var(array_code, {}, allow_recursion=10)
|
|
||||||
|
|
||||||
def _fixup_n_function_code(self, argnames, nsig_code, jscode, player_url):
|
|
||||||
varcode, varname, _ = self._extract_player_js_global_var(jscode, player_url)
|
|
||||||
if varcode and varname:
|
|
||||||
nsig_code = varcode + '; ' + nsig_code
|
|
||||||
_, global_list = self._interpret_player_js_global_var(jscode, player_url)
|
|
||||||
else:
|
else:
|
||||||
varname = 'dlp_wins'
|
self.write_debug('No global array variable found in player JS')
|
||||||
global_list = []
|
return argnames, re.sub(
|
||||||
|
rf';\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(?:(["\'])undefined\1|{varname}\[\d+\])\s*\)\s*return\s+{argnames[0]};',
|
||||||
undefined_idx = global_list.index('undefined') if 'undefined' in global_list else r'\d+'
|
';', code)
|
||||||
fixed_code = re.sub(
|
|
||||||
rf'''(?x)
|
|
||||||
;\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(?:
|
|
||||||
(["\'])undefined\1|
|
|
||||||
{re.escape(varname)}\[{undefined_idx}\]
|
|
||||||
)\s*\)\s*return\s+{re.escape(argnames[0])};
|
|
||||||
''', ';', nsig_code)
|
|
||||||
if fixed_code == nsig_code:
|
|
||||||
self.write_debug(join_nonempty(
|
|
||||||
'No typeof statement found in nsig function code',
|
|
||||||
player_url and f' player = {player_url}', delim='\n'), only_once=True)
|
|
||||||
return argnames, fixed_code
|
|
||||||
|
|
||||||
def _extract_n_function_code(self, video_id, player_url):
|
def _extract_n_function_code(self, video_id, player_url):
|
||||||
player_id = self._extract_player_info(player_url)
|
player_id = self._extract_player_info(player_url)
|
||||||
func_code = self._load_nsig_code_from_cache(player_url)
|
func_code = self.cache.load('youtube-nsig', player_id, min_ver='2025.03.25')
|
||||||
jscode = func_code or self._load_player(video_id, player_url)
|
jscode = func_code or self._load_player(video_id, player_url)
|
||||||
jsi = JSInterpreter(jscode)
|
jsi = JSInterpreter(jscode)
|
||||||
|
|
||||||
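The `_fixup_n_function_code` variants in the hunk above both use `re.sub` to strip a `typeof ... === "undefined"` early-return guard from the extracted n function, since the interpreter has no `typeof` implementation. The following is a minimal standalone sketch of that substitution, not the extractor's actual method: `strip_typeof_guard` and the sample JavaScript strings are hypothetical, and only the regular expression is taken from the diff.

```python
import re


def strip_typeof_guard(argname, varname, code, undefined_idx=r'\d+'):
    """Remove `;if(typeof X === "undefined") return <argname>;` from code.

    Hypothetical helper: mirrors the regex from the diff, nothing more.
    `varname` is the name of the global array; `undefined_idx` may be a
    concrete index or a digit pattern when the index is unknown.
    """
    return re.sub(
        rf'''(?x)
            ;\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(?:
                (["\'])undefined\1|
                {re.escape(varname)}\[{undefined_idx}\]
            )\s*\)\s*return\s+{re.escape(argname)};
        ''', ';', code)


# Made-up function bodies, for illustration only
literal_guard = 'var b=a.split("");if(typeof XY==="undefined")return a;b.reverse();return b.join("")'
array_guard = 'var b=a.split(G[0]);if(typeof XY===G[3])return a;return b.join(G[0])'

fixed_literal = strip_typeof_guard('a', 'G', literal_guard)
fixed_array = strip_typeof_guard('a', 'G', array_guard)
```

Escaping `varname` and the argument name with `re.escape`, as one side of the diff does, matters when the identifiers contain `$`, which is legal in JavaScript names but is a metacharacter in a regular expression.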
@ -2308,8 +2209,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
func_name = self._extract_n_function_name(jscode, player_url=player_url)
|
func_name = self._extract_n_function_name(jscode, player_url=player_url)
|
||||||
|
|
||||||
# XXX: Workaround for the global array variable and lack of `typeof` implementation
|
# XXX: Workaround for the global array variable and lack of `typeof` implementation
|
||||||
func_code = self._fixup_n_function_code(*jsi.extract_function_code(func_name), jscode, player_url)
|
func_code = self._fixup_n_function_code(*jsi.extract_function_code(func_name), jscode)
|
||||||
|
|
||||||
|
self.cache.store('youtube-nsig', player_id, func_code)
|
||||||
return jsi, player_id, func_code
|
return jsi, player_id, func_code
|
||||||
|
|
||||||
def _extract_n_function_from_code(self, jsi, func_code):
|
def _extract_n_function_from_code(self, jsi, func_code):
|
||||||
@ -3261,8 +3163,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if player_url:
|
if player_url:
|
||||||
self.report_warning(
|
self.report_warning(
|
||||||
f'nsig extraction failed: Some formats may be missing\n'
|
f'nsig extraction failed: Some formats may be missing\n'
|
||||||
f' n = {query["n"][0]} ; player = {player_url}\n'
|
f' n = {query["n"][0]} ; player = {player_url}',
|
||||||
f' {bug_reports_message(before="")}',
|
|
||||||
video_id=video_id, only_once=True)
|
video_id=video_id, only_once=True)
|
||||||
self.write_debug(e, only_once=True)
|
self.write_debug(e, only_once=True)
|
||||||
else:
|
else:
|
||||||
@ -3280,7 +3181,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
is_damaged = try_call(lambda: format_duration < duration // 2)
|
is_damaged = try_call(lambda: format_duration < duration // 2)
|
||||||
if is_damaged:
|
if is_damaged:
|
||||||
self.report_warning(
|
self.report_warning(
|
||||||
'Some formats are possibly damaged. They will be deprioritized', video_id, only_once=True)
|
f'{video_id}: Some formats are possibly damaged. They will be deprioritized', only_once=True)
|
||||||
|
|
||||||
po_token = fmt.get(STREAMING_DATA_INITIAL_PO_TOKEN)
|
po_token = fmt.get(STREAMING_DATA_INITIAL_PO_TOKEN)
|
||||||
|
|
||||||
@ -3646,8 +3547,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if 'sign in' in reason.lower():
|
if 'sign in' in reason.lower():
|
||||||
reason = remove_end(reason, 'This helps protect our community. Learn more')
|
reason = remove_end(reason, 'This helps protect our community. Learn more')
|
||||||
reason = f'{remove_end(reason.strip(), ".")}. {self._youtube_login_hint}'
|
reason = f'{remove_end(reason.strip(), ".")}. {self._youtube_login_hint}'
|
||||||
elif get_first(playability_statuses, ('errorScreen', 'playerCaptchaViewModel', {dict})):
|
|
||||||
reason += '. YouTube is requiring a captcha challenge before playback'
|
|
||||||
self.raise_no_formats(reason, expected=True)
|
self.raise_no_formats(reason, expected=True)
|
||||||
|
|
||||||
keywords = get_first(video_details, 'keywords', expected_type=list) or []
|
keywords = get_first(video_details, 'keywords', expected_type=list) or []
|
||||||
@ -3876,7 +3775,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
|||||||
if not traverse_obj(initial_data, 'contents'):
|
if not traverse_obj(initial_data, 'contents'):
|
||||||
self.report_warning('Incomplete data received in embedded initial data; re-fetching using API.')
|
self.report_warning('Incomplete data received in embedded initial data; re-fetching using API.')
|
||||||
initial_data = None
|
initial_data = None
|
||||||
if not initial_data and 'initial_data' not in self._configuration_arg('player_skip'):
|
if not initial_data:
|
||||||
query = {'videoId': video_id}
|
query = {'videoId': video_id}
|
||||||
query.update(self._get_checkok_params())
|
query.update(self._get_checkok_params())
|
||||||
initial_data = self._extract_response(
|
initial_data = self._extract_response(
|
||||||
|
@ -188,7 +188,6 @@ _COMP_OPERATORS = {'===', '!==', '==', '!=', '<=', '>=', '<', '>'}
|
|||||||
_NAME_RE = r'[a-zA-Z_$][\w$]*'
|
_NAME_RE = r'[a-zA-Z_$][\w$]*'
|
||||||
_MATCHING_PARENS = dict(zip(*zip('()', '{}', '[]')))
|
_MATCHING_PARENS = dict(zip(*zip('()', '{}', '[]')))
|
||||||
_QUOTES = '\'"/'
|
_QUOTES = '\'"/'
|
||||||
_NESTED_BRACKETS = r'[^[\]]+(?:\[[^[\]]+(?:\[[^\]]+\])?\])?'
|
|
||||||
|
|
||||||
|
|
||||||
class JS_Undefined:
|
class JS_Undefined:
|
||||||
@ -607,18 +606,15 @@ class JSInterpreter:
|
|||||||
|
|
||||||
m = re.match(fr'''(?x)
|
m = re.match(fr'''(?x)
|
||||||
(?P<assign>
|
(?P<assign>
|
||||||
(?P<out>{_NAME_RE})(?:\[(?P<index>{_NESTED_BRACKETS})\])?\s*
|
(?P<out>{_NAME_RE})(?:\[(?P<index>[^\]]+?)\])?\s*
|
||||||
(?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})?
|
(?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})?
|
||||||
=(?!=)(?P<expr>.*)$
|
=(?!=)(?P<expr>.*)$
|
||||||
)|(?P<return>
|
)|(?P<return>
|
||||||
(?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$
|
(?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$
|
||||||
)|(?P<attribute>
|
|
||||||
(?P<var>{_NAME_RE})(?:
|
|
||||||
(?P<nullish>\?)?\.(?P<member>[^(]+)|
|
|
||||||
\[(?P<member2>{_NESTED_BRACKETS})\]
|
|
||||||
)\s*
|
|
||||||
)|(?P<indexing>
|
)|(?P<indexing>
|
||||||
(?P<in>{_NAME_RE})\[(?P<idx>.+)\]$
|
(?P<in>{_NAME_RE})\[(?P<idx>.+)\]$
|
||||||
|
)|(?P<attribute>
|
||||||
|
(?P<var>{_NAME_RE})(?:(?P<nullish>\?)?\.(?P<member>[^(]+)|\[(?P<member2>[^\]]+)\])\s*
|
||||||
)|(?P<function>
|
)|(?P<function>
|
||||||
(?P<fname>{_NAME_RE})\((?P<args>.*)\)$
|
(?P<fname>{_NAME_RE})\((?P<args>.*)\)$
|
||||||
)''', expr)
|
)''', expr)
|
||||||
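The two sides of the hunk above differ in how the index of an assignment target is matched: one uses `_NESTED_BRACKETS`, the other the non-greedy `[^\]]+?`. The sketch below isolates only the `assign` branch to show the practical difference on a nested index such as `a[b[0]]=1`. It is an illustration under simplifying assumptions: `assignment_re` is a hypothetical helper, and the operator group from the real pattern is omitted.

```python
import re

# Both constants are copied from the diff
_NAME_RE = r'[a-zA-Z_$][\w$]*'
_NESTED_BRACKETS = r'[^[\]]+(?:\[[^[\]]+(?:\[[^\]]+\])?\])?'


def assignment_re(index_pattern):
    """Hypothetical helper: a reduced `assign` branch (no operator group)."""
    return re.compile(
        r'(?P<out>' + _NAME_RE + r')'
        r'(?:\[(?P<index>' + index_pattern + r')\])?\s*'
        r'=(?!=)(?P<expr>.*)$')


nested = assignment_re(_NESTED_BRACKETS)  # variant using _NESTED_BRACKETS
flat = assignment_re(r'[^\]]+?')          # variant using [^\]]+?

nested_match = nested.match('a[b[0]]=1')
flat_match = flat.match('a[b[0]]=1')
simple_match = flat.match('a[0]=1')
```

With the flat pattern the index stops at the first `]`, so the remaining `]=1` cannot be consumed and the branch does not match. The nested pattern accepts up to two extra levels of brackets inside the index.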
@ -711,7 +707,7 @@ class JSInterpreter:
|
|||||||
if obj is NO_DEFAULT:
|
if obj is NO_DEFAULT:
|
||||||
if variable not in self._objects:
|
if variable not in self._objects:
|
||||||
try:
|
try:
|
||||||
self._objects[variable] = self.extract_object(variable, local_vars)
|
self._objects[variable] = self.extract_object(variable)
|
||||||
except self.Exception:
|
except self.Exception:
|
||||||
if not nullish:
|
if not nullish:
|
||||||
raise
|
raise
|
||||||
@ -851,7 +847,7 @@ class JSInterpreter:
|
|||||||
raise self.Exception('Cannot return from an expression', expr)
|
raise self.Exception('Cannot return from an expression', expr)
|
||||||
return ret
|
return ret
|
||||||
|
|
||||||
def extract_object(self, objname, *global_stack):
|
def extract_object(self, objname):
|
||||||
_FUNC_NAME_RE = r'''(?:[a-zA-Z$0-9]+|"[a-zA-Z$0-9]+"|'[a-zA-Z$0-9]+')'''
|
_FUNC_NAME_RE = r'''(?:[a-zA-Z$0-9]+|"[a-zA-Z$0-9]+"|'[a-zA-Z$0-9]+')'''
|
||||||
obj = {}
|
obj = {}
|
||||||
obj_m = re.search(
|
obj_m = re.search(
|
||||||
@ -873,8 +869,7 @@ class JSInterpreter:
|
|||||||
for f in fields_m:
|
for f in fields_m:
|
||||||
argnames = f.group('args').split(',')
|
argnames = f.group('args').split(',')
|
||||||
name = remove_quotes(f.group('key'))
|
name = remove_quotes(f.group('key'))
|
||||||
obj[name] = function_with_repr(
|
obj[name] = function_with_repr(self.build_function(argnames, f.group('code')), f'F<{name}>')
|
||||||
self.build_function(argnames, f.group('code'), *global_stack), f'F<{name}>')
|
|
||||||
|
|
||||||
return obj
|
return obj
|
||||||
|
|
||||||
|
@ -3,7 +3,6 @@ import warnings
|
|||||||
|
|
||||||
from .common import (
|
from .common import (
|
||||||
HEADRequest,
|
HEADRequest,
|
||||||
PATCHRequest,
|
|
||||||
PUTRequest,
|
PUTRequest,
|
||||||
Request,
|
Request,
|
||||||
RequestDirector,
|
RequestDirector,
|
||||||
|
@ -505,7 +505,6 @@ class Request:
|
|||||||
|
|
||||||
|
|
||||||
HEADRequest = functools.partial(Request, method='HEAD')
|
HEADRequest = functools.partial(Request, method='HEAD')
|
||||||
PATCHRequest = functools.partial(Request, method='PATCH')
|
|
||||||
PUTRequest = functools.partial(Request, method='PUT')
|
PUTRequest = functools.partial(Request, method='PUT')
|
||||||
|
|
||||||
|
|
||||||
|
@ -150,15 +150,6 @@ class _YoutubeDLHelpFormatter(optparse.IndentedHelpFormatter):
|
|||||||
return opts
|
return opts
|
||||||
|
|
||||||
|
|
||||||
_PRESET_ALIASES = {
|
|
||||||
'mp3': ['-f', 'ba[acodec^=mp3]/ba/b', '-x', '--audio-format', 'mp3'],
|
|
||||||
'aac': ['-f', 'ba[acodec^=aac]/ba[acodec^=mp4a.40.]/ba/b', '-x', '--audio-format', 'aac'],
|
|
||||||
'mp4': ['--merge-output-format', 'mp4', '--remux-video', 'mp4', '-S', 'vcodec:h264,lang,quality,res,fps,hdr:12,acodec:aac'],
|
|
||||||
'mkv': ['--merge-output-format', 'mkv', '--remux-video', 'mkv'],
|
|
||||||
'sleep': ['--sleep-subtitles', '5', '--sleep-requests', '0.75', '--sleep-interval', '10', '--max-sleep-interval', '20'],
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class _YoutubeDLOptionParser(optparse.OptionParser):
|
class _YoutubeDLOptionParser(optparse.OptionParser):
|
||||||
# optparse is deprecated since Python 3.2. So assume a stable interface even for private methods
|
# optparse is deprecated since Python 3.2. So assume a stable interface even for private methods
|
||||||
ALIAS_DEST = '_triggered_aliases'
|
ALIAS_DEST = '_triggered_aliases'
|
||||||
@ -224,22 +215,6 @@ class _YoutubeDLOptionParser(optparse.OptionParser):
|
|||||||
return e.possibilities[0]
|
return e.possibilities[0]
|
||||||
raise
|
raise
|
||||||
|
|
||||||
def format_option_help(self, formatter=None):
|
|
||||||
assert formatter, 'Formatter can not be None'
|
|
||||||
formatted_help = super().format_option_help(formatter=formatter)
|
|
||||||
formatter.indent()
|
|
||||||
heading = formatter.format_heading('Preset Aliases')
|
|
||||||
formatter.indent()
|
|
||||||
result = []
|
|
||||||
for name, args in _PRESET_ALIASES.items():
|
|
||||||
option = optparse.Option('-t', help=shlex.join(args))
|
|
||||||
formatter.option_strings[option] = f'-t {name}'
|
|
||||||
result.append(formatter.format_option(option))
|
|
||||||
formatter.dedent()
|
|
||||||
formatter.dedent()
|
|
||||||
help_lines = '\n'.join(result)
|
|
||||||
return f'{formatted_help}\n{heading}{help_lines}'
|
|
||||||
|
|
||||||
|
|
||||||
def create_parser():
|
def create_parser():
|
||||||
def _list_from_options_callback(option, opt_str, value, parser, append=True, delim=',', process=str.strip):
|
def _list_from_options_callback(option, opt_str, value, parser, append=True, delim=',', process=str.strip):
|
||||||
@ -342,13 +317,6 @@ def create_parser():
|
|||||||
parser.rargs[:0] = shlex.split(
|
parser.rargs[:0] = shlex.split(
|
||||||
opts if value is None else opts.format(*map(shlex.quote, value)))
|
opts if value is None else opts.format(*map(shlex.quote, value)))
|
||||||
|
|
||||||
def _preset_alias_callback(option, opt_str, value, parser):
|
|
||||||
if not value:
|
|
||||||
return
|
|
||||||
if value not in _PRESET_ALIASES:
|
|
||||||
raise optparse.OptionValueError(f'Unknown preset alias: {value}')
|
|
||||||
parser.rargs[:0] = _PRESET_ALIASES[value]
|
|
||||||
|
|
||||||
general = optparse.OptionGroup(parser, 'General Options')
|
general = optparse.OptionGroup(parser, 'General Options')
|
||||||
general.add_option(
|
general.add_option(
|
||||||
'-h', '--help', dest='print_help', action='store_true',
|
'-h', '--help', dest='print_help', action='store_true',
|
||||||
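The `_preset_alias_callback` shown in the hunk above expands a preset by splicing its arguments onto the front of `parser.rargs`, so `optparse` processes them as if they had been typed on the command line. A self-contained sketch of that mechanism follows. The `PRESETS` table, the two target options and the `parse` function are stand-ins invented for this example; they are not yt-dlp's real option definitions.

```python
import optparse

# Hypothetical preset table; the real one is _PRESET_ALIASES in the diff
PRESETS = {'mp3': ['-x', '--audio-format', 'mp3']}


def preset_callback(option, opt_str, value, parser):
    if not value:
        return
    if value not in PRESETS:
        raise optparse.OptionValueError(f'Unknown preset alias: {value}')
    # rargs holds the arguments not yet processed; prepend the expansion
    parser.rargs[:0] = PRESETS[value]


def parse(argv):
    parser = optparse.OptionParser()
    parser.add_option('-x', dest='extract_audio', action='store_true', default=False)
    parser.add_option('--audio-format', dest='audio_format', default=None)
    parser.add_option(
        '-t', '--preset-alias', metavar='PRESET', dest='_', type='str',
        action='callback', callback=preset_callback)
    return parser.parse_args(argv)


opts, args = parse(['-t', 'mp3', 'URL'])
```

Because the expansion is inserted in place, options that follow `-t` on the command line are still processed after the preset's options and can therefore override them.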
@ -532,8 +500,7 @@ def create_parser():
|
|||||||
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
|
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
|
||||||
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
|
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
|
||||||
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
|
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
|
||||||
'2023': ['2024', 'prefer-vp9-sort'],
|
'2023': ['prefer-vp9-sort'],
|
||||||
'2024': [],
|
|
||||||
},
|
},
|
||||||
}, help=(
|
}, help=(
|
||||||
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
|
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
|
||||||
@ -551,15 +518,6 @@ def create_parser():
|
|||||||
'Alias options can trigger more aliases; so be careful to avoid defining recursive options. '
|
'Alias options can trigger more aliases; so be careful to avoid defining recursive options. '
|
||||||
f'As a safety measure, each alias may be triggered a maximum of {_YoutubeDLOptionParser.ALIAS_TRIGGER_LIMIT} times. '
|
f'As a safety measure, each alias may be triggered a maximum of {_YoutubeDLOptionParser.ALIAS_TRIGGER_LIMIT} times. '
|
||||||
'This option can be used multiple times'))
|
'This option can be used multiple times'))
|
||||||
general.add_option(
|
|
||||||
'-t', '--preset-alias',
|
|
||||||
metavar='PRESET', dest='_', type='str',
|
|
||||||
action='callback', callback=_preset_alias_callback,
|
|
||||||
help=(
|
|
||||||
'Applies a predefined set of options. e.g. --preset-alias mp3. '
|
|
||||||
f'The following presets are available: {", ".join(_PRESET_ALIASES)}. '
|
|
||||||
'See the "Preset Aliases" section at the end for more info. '
|
|
||||||
'This option can be used multiple times'))
|
|
||||||
|
|
||||||
network = optparse.OptionGroup(parser, 'Network Options')
|
network = optparse.OptionGroup(parser, 'Network Options')
|
||||||
network.add_option(
|
network.add_option(
|
||||||
|
@ -2044,7 +2044,7 @@ def url_or_none(url):
|
|||||||
if not url or not isinstance(url, str):
|
if not url or not isinstance(url, str):
|
||||||
return None
|
return None
|
||||||
url = url.strip()
|
url = url.strip()
|
||||||
return url if re.match(r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?|wss?):)?//', url) else None
|
return url if re.match(r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//', url) else None
|
||||||
|
|
||||||
|
|
||||||
def strftime_or_none(timestamp, date_format='%Y%m%d', default=None):
|
def strftime_or_none(timestamp, date_format='%Y%m%d', default=None):
|
||||||
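The `url_or_none` hunk above changes only the scheme alternation: one variant accepts `ws`/`wss` URLs and the other does not. The sketch below runs the same function body against both patterns. The extra `pattern` parameter, the names `WITH_WS` and `WITHOUT_WS`, and the example URLs are assumptions added for the comparison.

```python
import re

# Both patterns are copied from the two sides of the diff
WITH_WS = r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?|wss?):)?//'
WITHOUT_WS = r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//'


def url_or_none(url, pattern):
    """Same logic as in the diff, with the pattern passed in for comparison."""
    if not url or not isinstance(url, str):
        return None
    url = url.strip()
    return url if re.match(pattern, url) else None


ws_with = url_or_none('wss://example.com/socket', WITH_WS)
ws_without = url_or_none('wss://example.com/socket', WITHOUT_WS)
https_without = url_or_none('  https://example.com/video  ', WITHOUT_WS)
relative_without = url_or_none('//cdn.example.com/x', WITHOUT_WS)
```

Scheme-relative URLs such as `//cdn.example.com/x` pass under both patterns because the scheme group is optional.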
|
@ -1,8 +1,8 @@
|
|||||||
# Autogenerated by devscripts/update-version.py
|
# Autogenerated by devscripts/update-version.py
|
||||||
|
|
||||||
__version__ = '2025.03.31'
|
__version__ = '2025.03.25'
|
||||||
|
|
||||||
RELEASE_GIT_HEAD = '5e457af57fae9645b1b8fa0ed689229c8fb9656b'
|
RELEASE_GIT_HEAD = '9dde546e7ee3e1515d88ee3af08b099351455dc0'
|
||||||
|
|
||||||
VARIANT = None
|
VARIANT = None
|
||||||
|
|
||||||
@ -12,4 +12,4 @@ CHANNEL = 'stable'
|
|||||||
|
|
||||||
ORIGIN = 'yt-dlp/yt-dlp'
|
ORIGIN = 'yt-dlp/yt-dlp'
|
||||||
|
|
||||||
_pkg_version = '2025.03.31'
|
_pkg_version = '2025.03.25'
|
||||||
|