Subtitle-to-speech: handling a "length exceeded" error

I have updated to version 1.103.5 and only have the subtitle-to-speech feature installed. I ran the log through Doubao, and it says the problem is a length limit: "From the log you provided it is clear that the text-to-speech (TTS) service failed to run. The core problems are 'model processing length exceeded' and the resulting 'file does not exist' error." Log excerpt follows:
2025-08-31 09:44:19.280 [info] :hourglass_flowing_sand:准备拆分文本
2025-08-31 09:44:29.944 [info] :hourglass_flowing_sand:准备进行文本生成
2025-08-31 09:44:30.292 [info] :hourglass_flowing_sand::arrow_forward:初始化:准备启动
2025-08-31 09:45:40.130 [info] INFO: Started server process [11956]
INFO: Waiting for application startup.

2025-08-31 09:45:40.133 [info] INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:9900 (Press CTRL+C to quit)

2025-08-31 09:45:40.145 [info] :hourglass_flowing_sand:准备生成: 好的,各位,欢迎来到2016年9月ICT月度导师计划的第一个教学课程。
2025-08-31 09:45:40.147 [info] :white_check_mark:存在模型
2025-08-31 09:45:40.178 [info] INFO: 127.0.0.1:4696 - "POST /tts/IndexTTS/loadModel HTTP/1.1" 200 OK

2025-08-31 09:45:58.251 [info] 2025-08-31 09:45:58,250 WETEXT INFO building fst for zh_normalizer ...

2025-08-31 09:46:57.601 [info] 2025-08-31 09:46:57,599 WETEXT INFO done
2025-08-31 09:46:57,599 WETEXT INFO fst path: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_tagger.fst
2025-08-31 09:46:57,599 WETEXT INFO c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_verbalizer.fst

2025-08-31 09:46:57.736 [info] 2025-08-31 09:46:57,735 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_tagger.fst
2025-08-31 09:46:57,735 WETEXT INFO c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_verbalizer.fst
2025-08-31 09:46:57,735 WETEXT INFO skip building fst for en_normalizer ...

2025-08-31 09:47:04.864 [info] >> Be patient, it may take a while to run in CPU mode.

>> GPT weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\gpt.pth
Removing weight norm...
>> bigvgan weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\bigvgan_generator.pth
>> TextNormalizer loaded
>> bpe model loaded from: c:/Documents/shenghuabi/config/pythonAddon/model\bpe.model
>> start inference...
INFO: 127.0.0.1:4697 - "POST /tts/IndexTTS/text2speech HTTP/1.1" 500 Internal Server Error

2025-08-31 09:47:04.869 [info] :hourglass_flowing_sand:准备生成: 这是八个课程中的第一个。每个月你都会收到八个单独的教学视频,这些课程紧扣当月主题,并相互呼应,让整体更为完整。
2025-08-31 09:47:04.870 [info] :white_check_mark:存在模型
2025-08-31 09:47:05.412 [info] ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\fastapi\applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\middleware\errors.py", line 187, in __call__
    raise exc
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\middleware\errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\middleware\exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\fastapi\routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\fastapi\routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\main.py", line 43, in text2speech
    await indexTTS.text2Speech(item.audioPath, item.text, item.options.sentence['max_text_tokens_per_sentence'], item.options.generation.model_dump(), item.output)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\tts\indextts.py", line 24, in text2Speech
    result = self.tts.infer(
             ^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\infer.py", line 573, in infer
    codes = self.gpt.inference_speech(auto_conditioning, text_tokens,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\model.py", line 670, in inference_speech
    conds_latent = self.get_conditioning(speech_conditioning_mel, cond_mel_lengths)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\model.py", line 497, in get_conditioning
    speech_conditioning_input, mask = self.conditioning_encoder(speech_conditioning_input.transpose(1, 2),
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer_encoder.py", line 426, in forward
    xs, pos_emb, masks = self.embed(xs, masks)
                         ^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer\subsampling.py", line 185, in forward
    x, pos_emb = self.pos_enc(x, offset)
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer\embedding.py", line 140, in forward
    pos_emb = self.position_encoding(offset, x.size(1), False)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer\embedding.py", line 97, in position_encoding
    assert offset + size < self.max_len
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError


2025-08-31 10:02:16.770 [error] [Error: ENOENT: no such file or directory, open 'c:\Documents\shenghuabi\config\pythonAddon\chunk\65a9013f-f8a8-5592-b37e-99e6dd595fc4.wav'] {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'c:\\Documents\\shenghuabi\\config\\pythonAddon\\chunk\\65a9013f-f8a8-5592-b37e-99e6dd595fc4.wav'
}

It's not a length problem.
That sentence isn't even that long.
I suspect the model files may be incomplete. Try deleting the model folder and re-downloading it?


You could also try truncating that sentence and generating again, to see whether it really is a length problem. I'm out right now and can't debug; I'll take another look when I'm back.

I tried several approaches and none of them worked:
1. Updated to 1.103.6
2. Deleted everything under pythonAddon and re-downloaded the models
3. Used the subtitle editor's built-in tool to repair potential errors
4. Shortened every subtitle line to 26 characters
5. Kept only two subtitle lines

Below are my model files' SHA-256 hashes (and you can compare sizes too) to check whether they match yours.

SHA-256 can be computed via 7-Zip's right-click menu.

46e1f6277f7239363d2393f2f9fe36902cf8995e4acc0ba67ed25a025dbd02f0 bigvgan_discriminator.pth
a2458834d8277e76eb8614c9751b5e8eaa0474eab706f0ecfafcb600023133ed bigvgan_generator.pth
b2a5ce8090d32da3642cc4f81fdc996376bc6dd3f4cd5e3d165f71120d9f2bc8 bpe.model
864280aeb82a722ce561078c7f7f16a63b0c6cb270e9b9566a8ce5bffef38954 config.yaml
69e841bf8cd97a32806ea8a439c50017c991ac9e8bb795db89ec47828cae4d5d dvae.pth
44460b820a8afd58f68f3d3e69113e7900c8730bf519ecf158c081f2b8991240 gpt.pth
ef28cd7df6490fa782fabbca4e9106cd0517ca7c57a057f978de27b2885a5d1b unigram_12000.vocab
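If it helps to automate the comparison against the list above, the same hashes can be computed with nothing but Python's standard library. A minimal sketch (the directory path is whatever your `model` folder is; the file names follow the list above):

```python
import hashlib
from pathlib import Path

def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks, so multi-GB
    checkpoints like gpt.pth never have to fit in memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def report_hashes(model_dir: Path) -> None:
    # File names taken from the hash list in this thread.
    for name in ("bigvgan_discriminator.pth", "bigvgan_generator.pth",
                 "bpe.model", "config.yaml", "dvae.pth", "gpt.pth",
                 "unigram_12000.vocab"):
        f = model_dir / name
        if f.exists():
            print(sha256_of_file(f), f.stat().st_size, name)
        else:
            print("MISSING", name)
```

Calling `report_hashes(Path(r"c:\Documents\shenghuabi\config\pythonAddon\model"))` then lets you diff the output against the list above directly.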


I tested your sentence and it currently generates normally.
Are you using the CUDA build or the ZLUDA build? You could download the CPU-only build first and try that, ruling things out step by step.

I also tested this sentence:

这是八个课程中的第一个。每个月你都会收到八个单独的教学视频,这些课程紧扣当月主题,并相互呼应,让整体更为完整。

It generates normally, so this has nothing to do with text length.

Is it only this one sentence that fails to generate? Or does any sentence fail, so that even typing a single "1" won't generate?

Sure enough, even a single "1" errors out. Could it be a port conflict, or could something else be preventing the service from starting properly?
I compared both the SHA-256 hashes and the file sizes; they all match.
2025-08-31 22:01:52.841 [info] :hourglass_flowing_sand:准备拆分文本
2025-08-31 22:01:58.775 [info] :hourglass_flowing_sand:准备进行文本生成
2025-08-31 22:01:59.172 [info] :hourglass_flowing_sand::arrow_forward:初始化:准备启动
2025-08-31 22:03:00.463 [info] INFO: Started server process [19928]
INFO: Waiting for application startup.

2025-08-31 22:03:00.464 [info] INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:9900 (Press CTRL+C to quit)

2025-08-31 22:03:00.468 [info] :hourglass_flowing_sand:准备生成: 1
2025-08-31 22:03:00.468 [info] :white_check_mark:存在模型
2025-08-31 22:03:00.497 [info] INFO: 127.0.0.1:3387 - "POST /tts/IndexTTS/loadModel HTTP/1.1" 200 OK

2025-08-31 22:03:18.251 [info] 2025-08-31 22:03:18,249 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_tagger.fst
2025-08-31 22:03:18,249 WETEXT INFO c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_verbalizer.fst
2025-08-31 22:03:18,249 WETEXT INFO skip building fst for zh_normalizer ...

2025-08-31 22:03:19.308 [info] 2025-08-31 22:03:19,307 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_tagger.fst
2025-08-31 22:03:19,307 WETEXT INFO c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_verbalizer.fst
2025-08-31 22:03:19,307 WETEXT INFO skip building fst for en_normalizer ...

2025-08-31 22:03:26.396 [info] >> Be patient, it may take a while to run in CPU mode.

>> GPT weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\gpt.pth
Removing weight norm...
>> bigvgan weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\bigvgan_generator.pth
>> TextNormalizer loaded
>> bpe model loaded from: c:/Documents/shenghuabi/config/pythonAddon/model\bpe.model
>> start inference...
INFO: 127.0.0.1:3388 - "POST /tts/IndexTTS/text2speech HTTP/1.1" 500 Internal Server Error

2025-08-31 22:03:26.399 [info] :hourglass_flowing_sand:准备进行语音拼接
2025-08-31 22:03:26.403 [error] [Error: ENOENT: no such file or directory, open

It's not a port conflict; if the port were in conflict, the model load wouldn't even happen.
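To test the port theory directly, one can probe the address from the log (`127.0.0.1:9900` in the Uvicorn line above). A minimal standard-library sketch, offered only as a diagnostic aid:

```python
import socket

def port_in_use(host: str, port: int) -> bool:
    """Return True if something is already listening on host:port.
    connect_ex returns 0 on a successful TCP connect, a nonzero
    errno otherwise, without raising."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0
```

Running `port_in_use("127.0.0.1", 9900)` before launching the addon would reveal a squatter on the port; here it returns True only while the TTS server itself is up, which matches the conclusion that the port is fine.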

Is this error the same as last time? (Setting aside the "Error: ENOENT: no such file or directory, open" message, since that error only appears because the generation step already failed.)

Yes, even a single "1" gives the same error. I don't know what causes this: 'INFO: 127.0.0.1:1347 - "POST /tts/IndexTTS/text2speech HTTP/1.1" 500 Internal Server Error'. Yesterday I installed a standalone index-tts, and that one did test fine. The detailed log is as follows:
2025-09-01 14:24:26.732 [info] :hourglass_flowing_sand:准备拆分文本
2025-09-01 14:24:31.469 [info] :hourglass_flowing_sand:准备进行文本生成
2025-09-01 14:24:31.911 [info] :hourglass_flowing_sand::arrow_forward:初始化:准备启动
2025-09-01 14:25:39.450 [info] INFO: Started server process [21052]
INFO: Waiting for application startup.

2025-09-01 14:25:39.451 [info] INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:9900 (Press CTRL+C to quit)

2025-09-01 14:25:39.461 [info] :hourglass_flowing_sand:准备生成: 1
2025-09-01 14:25:39.462 [info] :white_check_mark:存在模型
2025-09-01 14:25:39.487 [info] INFO: 127.0.0.1:1346 - "POST /tts/IndexTTS/loadModel HTTP/1.1" 200 OK

2025-09-01 14:25:58.629 [info] 2025-09-01 14:25:58,628 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_tagger.fst
2025-09-01 14:25:58,628 WETEXT INFO c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_verbalizer.fst
2025-09-01 14:25:58,628 WETEXT INFO skip building fst for zh_normalizer ...

2025-09-01 14:25:59.827 [info] 2025-09-01 14:25:59,826 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_tagger.fst
2025-09-01 14:25:59,826 WETEXT INFO c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_verbalizer.fst
2025-09-01 14:25:59,826 WETEXT INFO skip building fst for en_normalizer ...

2025-09-01 14:26:07.285 [info] >> Be patient, it may take a while to run in CPU mode.

>> GPT weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\gpt.pth
Removing weight norm...
>> bigvgan weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\bigvgan_generator.pth
>> TextNormalizer loaded
>> bpe model loaded from: c:/Documents/shenghuabi/config/pythonAddon/model\bpe.model
>> start inference...
INFO: 127.0.0.1:1347 - "POST /tts/IndexTTS/text2speech HTTP/1.1" 500 Internal Server Error

2025-09-01 14:26:07.289 [info] :hourglass_flowing_sand:准备进行语音拼接
2025-09-01 14:26:07.295 [error] [Error: ENOENT: no such file or directory, open 'c:\Documents\shenghuabi\config\pythonAddon\chunk\5a560745-2fa8-5743-9c6c-6a12c2e7c64d.wav'] {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'c:\\Documents\\shenghuabi\\config\\pythonAddon\\chunk\\5a560745-2fa8-5743-9c6c-6a12c2e7c64d.wav'
}
2025-09-01 14:26:07.857 [info] ERROR: Exception in ASGI application
Traceback (most recent call last):
  ... (identical, frame for frame, to the traceback above: the uvicorn/starlette/fastapi dispatch into main.py text2speech, then indextts\infer.py line 573, gpt\model.py lines 670 and 497, down to gpt\conformer\embedding.py line 97 in position_encoding) ...
    assert offset + size < self.max_len
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

I've pushed version 1.103.8 with some extra debug output. After it errors again, please send the log and I'll check whether it's a configuration problem.

The detailed log is as follows:

2025-09-01 18:46:35.368 [info] ⏳准备拆分文本
2025-09-01 18:46:35.368 [info] 读取缓存
2025-09-01 18:46:41.311 [info] ⏳准备进行文本生成
2025-09-01 18:46:41.736 [info] ⏳▶️初始化:准备启动
2025-09-01 18:46:55.330 [info] INFO:     Started server process [22828]

2025-09-01 18:46:55.331 [info] INFO:     Waiting for application startup.
INFO:     Application startup complete.

2025-09-01 18:46:55.333 [info] INFO:     Uvicorn running on http://127.0.0.1:9900 (Press CTRL+C to quit)

2025-09-01 18:46:55.340 [info] ⏳准备生成: c:\Documents\shenghuabi\config\pythonAddon\reference\ICT-EN\english\default-1756630988448.wav 1 {"generation":{"do_sample":true,"top_p":0.8,"top_k":30,"temperature":1,"length_penalty":0,"num_beams":3,"repetition_penalty":10,"max_mel_tokens":600},"sentence":{"max_text_tokens_per_sentence":120,"sentences_bucket_max_size":4}} c:/Documents/shenghuabi/config/pythonAddon/chunk/5a560745-2fa8-5743-9c6c-6a12c2e7c64d.wav
2025-09-01 18:46:55.342 [info] ✅存在模型
2025-09-01 18:46:55.377 [info] INFO:     127.0.0.1:8795 - "POST /tts/IndexTTS/loadModel HTTP/1.1" 200 OK

2025-09-01 18:47:17.906 [info] 2025-09-01 18:47:17,904 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_tagger.fst
2025-09-01 18:47:17,905 WETEXT INFO                     c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\utils\tagger_cache\zh_tn_verbalizer.fst
2025-09-01 18:47:17,905 WETEXT INFO skip building fst for zh_normalizer ...

2025-09-01 18:47:18.847 [info] 2025-09-01 18:47:18,845 WETEXT INFO found existing fst: c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_tagger.fst
2025-09-01 18:47:18,846 WETEXT INFO                     c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\tn\en_tn_verbalizer.fst

2025-09-01 18:47:18.847 [info] 2025-09-01 18:47:18,846 WETEXT INFO skip building fst for en_normalizer ...

2025-09-01 18:47:26.187 [info] >> Be patient, it may take a while to run in CPU mode.
>> GPT weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\gpt.pth
Removing weight norm...
>> bigvgan weights restored from: c:/Documents/shenghuabi/config/pythonAddon/model\bigvgan_generator.pth
>> TextNormalizer loaded
>> bpe model loaded from: c:/Documents/shenghuabi/config/pythonAddon/model\bpe.model
>> start inference...
INFO:     127.0.0.1:8796 - "POST /tts/IndexTTS/text2speech HTTP/1.1" 500 Internal Server Error

2025-09-01 18:47:26.189 [info] ⏳准备进行语音拼接
2025-09-01 18:47:26.189 [info] 拼接列表 [{"type":"file","filePath":"c:/Documents/shenghuabi/config/pythonAddon/chunk/5a560745-2fa8-5743-9c6c-6a12c2e7c64d.wav","subtitle":{"text":"1"}}] 输出 c:/Documents/shenghuabi/config/pythonAddon/output/test.txt-2025-09-01 18-46-41
2025-09-01 18:47:26.196 [error] [Error: ENOENT: no such file or directory, open 'c:\Documents\shenghuabi\config\pythonAddon\chunk\5a560745-2fa8-5743-9c6c-6a12c2e7c64d.wav'] {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'c:\\Documents\\shenghuabi\\config\\pythonAddon\\chunk\\5a560745-2fa8-5743-9c6c-6a12c2e7c64d.wav'
}
2025-09-01 18:47:26.205 [info] ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\fastapi\applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\middleware\errors.py", line 187, in __call__
    raise exc
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\middleware\errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\middleware\exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\starlette\routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\fastapi\routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\fastapi\routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\main.py", line 43, in text2speech
    await indexTTS.text2Speech(item.audioPath, item.text, item.options.sentence['max_text_tokens_per_sentence'], item.options.generation.model_dump(), item.output)
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\tts\indextts.py", line 24, in text2Speech
    result = self.tts.infer(
             ^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\infer.py", line 573, in infer
    codes = self.gpt.inference_speech(auto_conditioning, text_tokens,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\model.py", line 670, in inference_speech
    conds_latent = self.get_conditioning(speech_conditioning_mel, cond_mel_lengths)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\model.py", line 497, in get_conditioning
    speech_conditioning_input, mask = self.conditioning_encoder(speech_conditioning_input.transpose(1, 2),
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer_encoder.py", line 426, in forward
    xs, pos_emb, masks = self.embed(xs, masks)
                         ^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer\subsampling.py", line 185, in forward
    x, pos_emb = self.pos_enc(x, offset)
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer\embedding.py", line 140, in forward
    pos_emb = self.position_encoding(offset, x.size(1), False)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Documents\shenghuabi\config\pythonAddon\lib\.venv\Lib\site-packages\indextts\gpt\conformer\embedding.py", line 97, in position_encoding
    assert offset + size < self.max_len
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

I took a look; the parameters in the log are completely normal...
Do you actually have an audio file at c:\Documents\shenghuabi\config\pythonAddon\reference\ICT-EN\english\default-1756630988448.wav? How long is it (the audio's duration)?
If that's normal too, I have no idea what the problem is :rofl: because everything in the traceback is IndexTTS core code.
When you said earlier that it worked, did you mean your own manual deployment worked, or that generating through my software worked?

Are you sure it used to generate audio and now it doesn't?

This update didn't change anything else, so this problem strikes me as very odd.

If I'm guessing right, your reference audio is too long. It needs to be shorter; ten-odd seconds is enough. How long is yours, several minutes?
I generated a deliberately long reference audio for testing and hit this exact error.
By my calculation, at most about one minute of reference audio should be supported.
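The back-of-envelope calculation can be sketched as follows. The sample rate and hop length here are assumed typical mel-spectrogram settings, not values read from the IndexTTS config, so treat the numbers as order-of-magnitude only:

```python
# Rough estimate of how many conditioning-mel frames a reference clip
# produces. SAMPLE_RATE and HOP_LENGTH are assumptions, not IndexTTS
# config values.
SAMPLE_RATE = 24_000   # samples per second (assumed)
HOP_LENGTH = 256       # samples per mel frame (assumed)

def mel_frames(duration_seconds: float) -> int:
    """Mel frames fed into the conditioning encoder for a clip of the
    given duration."""
    return int(duration_seconds * SAMPLE_RATE / HOP_LENGTH)

# A 16-second clip -> 1,500 frames; a 5-minute clip -> 28,125 frames.
# The positional-encoding table has a fixed max_len, so a long enough
# clip trips `assert offset + size < self.max_len` while a short one
# stays comfortably inside it.
```

This is consistent with the failure mode: the assertion fires in `position_encoding`, whose input size scales with the reference audio, not with the text.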

Don't people usually just record one on the spot... how did you even end up with such a long clip :rofl:

The sample audio file does exist at that path, and it is 5 minutes long. Shenghuabi did work the first time I used it; after I tried changing the sample audio file, it stopped working. Later I re-downloaded the models and it worked once, then broke again, and it has stayed broken ever since.
The "working" I mentioned earlier refers to the standalone index-tts I installed yesterday; testing through the index-tts webui worked fine.

Personally, I'd still suggest adding some exception handling, e.g. when an exception occurs, show a message and skip the subsequent steps.
Also, consider adding some global settings before the "generate speech" step.

I hadn't considered that possibility; let me try it again. Thanks!

That was indeed the problem. Switching to a 16-second sample fixed it. You might consider adding a validation check when audio is uploaded.
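The upload-time check suggested here could be sketched with just the standard library's `wave` module (which only reads uncompressed PCM WAV; the 60-second ceiling below is an estimate from this thread, not a documented IndexTTS limit):

```python
import io
import wave

# Estimated ceiling from this thread's testing, not a documented limit.
MAX_REFERENCE_SECONDS = 60.0

def reference_audio_duration(wav_bytes: bytes) -> float:
    """Duration in seconds of an uncompressed PCM WAV payload."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def validate_reference_audio(wav_bytes: bytes) -> float:
    """Fail early with a readable message instead of letting the TTS
    backend die later with a bare AssertionError."""
    duration = reference_audio_duration(wav_bytes)
    if duration > MAX_REFERENCE_SECONDS:
        raise ValueError(
            f"reference audio is {duration:.1f}s long; "
            f"please use a clip of at most {MAX_REFERENCE_SECONDS:.0f}s"
        )
    return duration
```

A 5-minute sample would be rejected at upload with an explicit message, while the 16-second sample passes.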

I'll try to add the validation. The tricky part was that I couldn't tell where this exception was coming from; all the parameters looked normal, and I never suspected the reference audio's duration.