What are some free or more affordable alternatives to Elevenlabs for AI voices? [Discussion]

I have an idea for a project I want to do: I want to create a mod for a video game which does TTS for all the written dialogue. Problem is, the game I want to do this for has a LOT of dialogue, much of it not spoken out loud, to the point where generating speech for every single line using an Elevenlabs subscription would be enormously expensive.

That's not even accounting for re-dos where the first take doesn't work. Even assuming I only use the first result it gives me, it would still be completely unaffordable for me.

As I understand it, there are two ways I could accomplish this:

  1. I could either take every line of dialogue, create an audio file for it, and then load that in when it appears in-game. Lots of work up-front, but results would be higher-quality, although due to user error I'd likely miss many areas with dialogue. This would also be a lot of data assuming the user needs to install it locally. Streaming the data probably wouldn't be possible with my current setup.
  2. I could have an AI generate speech for dialogue as it appears, in real time, using either a server or the user's local computer to handle the workload. This would be much less work upfront, and while results would be lower-quality I think it would be fine even if the occasional line sounds kind of weird.
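
For what it's worth, the two options don't have to be exclusive. Here's a minimal Python sketch of a hybrid, assuming clips are stored as files named after a hash of the dialogue text. `clip_key`, `get_clip`, and the `synthesize` callable are hypothetical names for illustration, not part of any real TTS library; `synthesize` stands in for whatever engine ends up producing the audio bytes.

```python
import hashlib
from pathlib import Path
from typing import Callable


def clip_key(line: str) -> str:
    """Stable file name for a dialogue line: a hash of its normalized text."""
    # Collapse whitespace and ignore case so trivial differences reuse one clip.
    normalized = " ".join(line.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def get_clip(line: str, cache_dir: Path, synthesize: Callable[[str], bytes]) -> Path:
    """Return the audio file for `line`, generating it only on a cache miss."""
    path = cache_dir / f"{clip_key(line)}.wav"
    if not path.exists():
        # Option 2 (generate on demand) fills the gaps left by option 1.
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(synthesize(line))
    return path
```

Lines generated ahead of time would be found on disk right away, and any line that was missed would only pay the generation cost the first time it shows up.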

I'm still researching if this is even possible on a budget, and I don't have a lot of experience with TTS models. Some of the real-time TTS models I've seen take around 1-2 seconds for 1-2 lines of dialogue. I think 1 second of delay might be okay, but anything past 2 seconds for short quips is way too much, and I would just consider it unusable at that point.
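
To make that latency budget concrete, here's a rough Python sketch for timing a candidate model against those thresholds. The 1-second and 2-second cutoffs come from the paragraph above; the "borderline" label for anything in between is my own assumption, and `synthesize` is a hypothetical stand-in for whichever TTS call is being tested.

```python
import time
from typing import Callable, Tuple

OK_SECONDS = 1.0        # about 1 second of delay might be okay
UNUSABLE_SECONDS = 2.0  # anything past 2 seconds is treated as unusable


def classify_latency(seconds: float) -> str:
    """Map a measured delay onto the budget described above."""
    if seconds <= OK_SECONDS:
        return "ok"
    if seconds <= UNUSABLE_SECONDS:
        return "borderline"  # assumed label for the 1-2 second gap
    return "unusable"


def time_synthesis(synthesize: Callable[[str], bytes], line: str) -> Tuple[float, str]:
    """Run one synthesis call and return (elapsed seconds, verdict)."""
    start = time.perf_counter()
    synthesize(line)
    elapsed = time.perf_counter() - start
    return elapsed, classify_latency(elapsed)
```

Running `time_synthesis` over a sample of short quips and longer lines on the target hardware would show whether option 2 is realistic before committing to it.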

Of those two options, I'd MUCH rather do option 2, but I straight up have no idea if it's even possible to do it in a timely manner, given current technology.

Also for those wondering about the ethics of this idea, I'd be using generic voices with actors who have consented to being used for AI, I wouldn't be cloning the voices of in-game characters. Ideally I would be able to generate material even relatively close to the level of Elevenlabs. If it sounds bad enough (Amazon Polly sounds pretty bad) I probably wouldn't even want to bother.

Any advice? I'm not finding good solutions and at this point I'll probably have to table this idea until an affordable service comes along.

I think there’s enough time between now and elections for something to happen, but in all likelihood I don’t expect anything dramatic. “Dark Brandon” was a flash in the pan moment, I don’t think he’s capable of doing anything like that purposefully.

He’s a very tired old man who likely has more on his plate than he can deal with, given his aged body and aging faculties. Trump is similar in many regards, but his difference is that incoherence is part of Trump’s brand at this point, he found a way to make that part of him work for himself.

My personal odds as of right now are like 60-40 Trump to Biden, but obviously I’m just some schmuck so take that with a grain of salt. The debate was bad, but a TON of people still really hate Trump, and that shouldn’t be overlooked. I think Biden can still pull it back but he needs an actual strategy. But there’s a high chance he doesn’t have one, given the history of his presidency.

I have a specific post I made a while back that made this really obvious to me. It got like 200 upvotes, but when I went back to review it, I discovered that I had done something wrong, and pretty much all my information was bad.

Many people pointed it out in the comments, but in cases like these it feels like there’s a disconnect between upvotes and comments. Commenters are the minority, most people don’t even vote, but many more people up/downvote instead of commenting, and those groups can be independent of each other.

I left the post up, because hopefully people finding it on Google will be smart enough to glance at the comments on it, but it’s still alarming how much stuff like this happens all the time to users other than me. It’s the same problem we have with AI these days, where people will sometimes believe the lies it tells because of the level of confidence it displays.

Good reply, I do agree with most of your points.

I’m not contesting the idea that my content “deserves” to be downvoted; obviously it does when it is, because every user’s vote is weighted equally and I’m just witnessing the result of that. If I wanted to I could try to make a stink about bot downvotes but I think on most subreddits it’s a non-issue.

I think my larger problem mainly comes from the broad community of so many of these different groups online. I think in many cases the communities (or in some cases mods) are so toxic or insular that expecting anything to get upvotes (or exposure to broader audiences) is like expecting a den of goblins to spontaneously accept me as their friend. Why would goblins ever do that?

I’m not even really commenting on subs I’ve posted on specifically, there’s a bunch of subs I straight up never post to because I know with certainty that I’m just not welcome because I have no interest in conforming to their culture.

It’s within their right and I don’t expect rules that would change that (if it were even possible), but I’ve definitely noticed a shift in the average user’s tone and personality over the years I’ve used this website. It’s just frustrating that this website is starting to become (in my opinion) a slightly worse place to be year-by-year, and since it’s fundamentally replaced online forums, there’s even more pressure to conform.

There are alternatives such as Discord, but they’re not the same, and it’s too hard to do direct comparisons. Outside of Discord and Reddit though, and maybe Facebook groups, sometimes it feels like you can’t find a healthy place to talk about a specific subject.

I’m not even convinced this is a Reddit-specific problem, this could honestly just be a slow cultural shift as the result of many different factors. I really don’t think Reddit is solely the source of the problems I speak of, but I do believe it plays a small role in it.

Tl;dr many communities have people whom I don’t respect and who don’t respect me. It’s frustrating but I’m learning to accept adversity and stigma.

Is there a good reason for downvoted posts being able to subtract karma from the poster’s account, beyond the original post?

You can take a look at my profile if you’re curious what I’ve been up to, but long story short I’ve made some opinion-based posts and gotten downvoted on many of them, big surprise.

Personally, I actually don’t care very much about getting downvoted. It’s a little frustrating that my posts won’t get more engagement because of said downvotes, but for me this is just a minor annoyance since I honestly just expect everything to get downvotes by default. I’m usually just looking for conversations or information, basically the only reason I ever post anything.

What concerns me is that with the way Reddit is set up, I feel like this system biases basically every post you see that gets any upvotes at all. Being able to essentially attack a person’s account from any of their posts is a feature exclusive to Reddit, no other forum I’ve ever used does that.

Ideally I’d want Reddit set up so that, if someone gets downvoted to hell, they might just leave the post up because people finding it later on Google or whatever might think it’s interesting. The fact that one really bad post could result in a karma bomb on your account probably discourages a lot of people from posting on certain things.

I feel like a ton of people censor themselves purely because of the karma system. I think deleting a post because you’re embarrassed by the results is perfectly normal and human, but to me Reddit’s system has always felt a little weird because of how much it guides your hand, even if you don’t notice it doing so.

The result is that most of the conversational posts we see are extreme opinions that lack nuance, or feature a distinct lack of disagreeable opinions. This results in many subreddits just feeling like echo chambers, which I’m not into. When I see opinions I disagree with, oftentimes I want to engage with that person to see why they feel that way, I don’t want to just delete them entirely because I disagree or whatever.

There are exceptions like r/unpopularopinions, but besides these niche cases you pretty much have to conform to expectations or you are passively informed that your content is unwelcome and that you shouldn’t exist.

I’m happy I don’t suffer from Reddit-induced anxiety, but I know for certain a ton of people do for this very reason.

Turbopasta
OP
-4 · Edited

My stance is that, if something like this was going to be implemented, it should only be used in cases where the original actors can consent to their voices being used in this way. Otherwise the only usage for the original voices should be whatever the actor originally agreed to, which wouldn't include AI voice sampling.

Only people who are alive can consent to their likeness being used in this way, and even in the cases of those still living, I'd be in support of people being able to say no to their voice being used in this way as well, assuming their contract didn't stipulate anything about this usage. Consent is everything here.

Edit: lmao why is this getting downvoted, do you guys hate consent? Kinda cringe ngl

How would you feel if Square Enix started using AI to create new voice acting for previously text-only conversations and cutscenes in ffxiv? [Discussion]

This is purely a hypothetical question, as the devs have made no announcement of any intent to do this. I'm just curious what the community would think since AI tends to be such a divisive topic.

Personally, I consider myself mostly anti-AI, with the exceptions of very specific use cases, and used only as a supplementary tool, with complete legal permission. My mostly negative viewpoint stems from the fact that many are abusing the power that comes with using AI tools, and in many cases it makes the product that they were used on worse than before. Many people involved in AI are also trying to make quick money before the bubble bursts, which leads to bad leadership and bad integration. I like the idea of the technology, but I don't like the lack of regulations and accountability that currently comes with it.

With all that said, I was thinking about practical ways AI might be used in an MMO like ffxiv, and I actually think even as someone who usually opposes AI that I would welcome this addition (with caution, but more on that later).

Much of the ffxiv experience (especially the older parts such as ARR) feels antiquated and lacks polish after all this time. The devs have made attempts to streamline parts of it, but even after those changes it's hard to find reasons for the devs to polish older content when the newer content demands so much of their attention.

Older cutscenes especially have many moments where important things are being said, but there is no voice acting. Voice acting in a cutscene is usually a sign that the scene is very important and that the player should pay attention, but this has the side effect of making the player value these voiceless cutscenes less.

(However, you could also make the argument that players can't be expected to retain ALL the information of a given story/expansion/whatever, and by putting voices into specific cutscenes it informs players which scenes they should make an effort to remember in order to retain most of the important information. This is a valid criticism of my idea and honestly I'm not sure where I stand on it yet)

I feel confident in assuming that the reason many sections don't have voice acting is because of financial/logistical reasons, not because of artistic ones. Which is to say, I don't think they intentionally want sections without voice acting because they think it adds more to the game, quite the opposite. But it's a compromise that has to be made because this game is such a large production as-is. Even getting partially voice-acted cutscenes in ffxiv is a lot better than getting none at all, in my opinion.

AI-supplemented voices wouldn't be a perfect solution, but they might be valuable enough that many people would feel they add significant value to the experience. If all cutscenes and conversations had voice-acting, player retention would be higher, and overall enjoyment would be higher as well. In execution the AI would probably do its best to duplicate the voices of whatever character is speaking, as well as making an educated guess at what a previously voiceless character would sound like.

As great as the idea of more high-quality voice acting can be, I can also acknowledge the potential for negative results. Adding AI voices to the game could potentially devalue *all* cutscenes, because you might find it harder to connect to a character's performance if you think there's a chance it could have been AI generated, which is totally fair. For this reason I would propose either making this feature optional, and/or prefacing a cutscene/convo by telling the player that the audio is AI-generated.

It would take development time and money to integrate as well, but for the sake of the hypothetical I'm ignoring this factor, I'm mostly just pondering the ethics of the idea. I'm also ignoring the complexity that comes from the fact that not everyone plays with English voice acting. Hypothetically you could do this across all supported languages, but that would obviously require more time and money.

I can also acknowledge that doing something like this could potentially have a negative snowball effect, where in the future Square Enix (and companies like them) are more tempted to just generate AI voices instead of hiring real voice actors for future content. I would feel a lot better about this idea if I somehow knew that they wouldn't do this, but realistically if the quality was good enough I think there's a strong chance they would try it and see if they could get away with it.

Did I miss something here? This whole ramble is basically me posing a thought experiment. I'm not 100% on board with this idea, and like I said up front I'm mostly just curious what everyone's thoughts on it would be. I'm open to ideas and concerns, feel free to let me know if I overlooked something or if you have anything to say.

Playing as Neon is like being in the driver’s seat of a car accident [Discussion]

But in a good way, if that makes any sense.

I just picked up this agent recently on a whim. Everyone knows Neon is fast, but somehow she feels so much faster when you’re the one in control. There’s this huge sense of whiplash when every other agent in the game moves extremely slow in comparison. The difference is like going from a skateboard to a jet ski.

I played some games with her (in normals, I’m not a psycho) and most of my rounds were complete disasters, but not all of them. Grab the Bucky, run straight into 3 people, kill one of them and die. Or run around the whole map, flank the entire enemy team before anyone can turn around. Or just run out and die instantly because they shot bullets into my wall.

I watched some pro vods of Neon players after. Obviously they’re much better than me, but they also play Neon so differently. Everyone expects the “YouTube shorts Neon montage” style of play where you’re just slipping and sliding like crazy, but many of these pros actually take their time with this agent and don’t just rush into everything headfirst. Crazy, I know.

Very fun agent, but it’s also kind of exhausting trying to learn her. I feel like my fundamentals could get worse over time if I focus too much on the movement too. Any Neon fans?

Propose an awful change for Valorant that could realistically happen [Discussion]

New update that hides every player’s KDA, to prevent toxic players from harassing their team’s bottom fragger

If it wasn’t negatively affecting active players I’m sure the community wouldn’t care what they do. But these bots are getting progressively worse and making an otherwise fun game feel boring and pointless. Why play a multiplayer game at all if you’re the only one playing it?

SD is clearly the lesser of two evils here, I’d rather take their solution for this than try to coexist with this bot cancer

What’s to stop them from setting up a system that monitors players that retreat extremely often on turn 3 specifically? And then punishing or restricting players who continue to do this? Even if it forced them to retreat turn 4 instead for a while that would still be forward progress. And I can’t imagine you’d get many false positives since these people are just spamming this strategy so rapidly
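
A rough Python sketch of what that kind of monitoring could look like, assuming the game logs the turn on which a player retreated (or nothing if they played the game out). The class name, window size, minimum game count, and 80% cutoff are all made-up values for illustration, not anything SD has described.

```python
from collections import deque
from typing import Deque, Optional


class RetreatMonitor:
    """Tracks how often one account retreats on turn 3 over its recent games."""

    def __init__(self, window: int = 50, min_games: int = 20, max_rate: float = 0.8):
        self.min_games = min_games  # don't judge an account on too few games
        self.max_rate = max_rate    # hypothetical cutoff for "extremely often"
        self.recent: Deque[bool] = deque(maxlen=window)

    def record_game(self, retreat_turn: Optional[int]) -> None:
        """Log one game; `retreat_turn` is None if the player never retreated."""
        self.recent.append(retreat_turn == 3)

    def turn3_rate(self) -> float:
        """Fraction of recent games that ended in a turn 3 retreat."""
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def is_suspicious(self) -> bool:
        """True once enough games are logged and the turn 3 rate is above the cutoff."""
        return len(self.recent) >= self.min_games and self.turn3_rate() > self.max_rate
```

A check this simple only catches the exact turn 3 pattern, so bots that shift to retreating on turn 4 would slip past it until the rule was updated.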

VoiceMeeter banana randomly cutting out on very quiet ambient audio? [Help (VoiceMeeter Banana)]

I noticed that when I play video games that have quiet moments with very quiet ambient sound, my speakers (headphones) will do this weird thing. Let's say we have audio that starts from no sound, becomes louder, then gets quiet again. I won't hear the very beginning and end of those sounds at all, but it'll suddenly click on at some volume above 0, maybe around a 5 out of 100 for example.

I tested my headphones outside of VoiceMeeter and they're fine, this seems to be an issue isolated to VM. Curiously, after testing my headphones separately and then reloading VM banana, the issue is gone, but I'd like to know what's causing this in the first place since this will probably happen again next time I boot up my computer. It's very annoying playing games where audio is very important, like multiplayer shooters, and not being able to hear very quiet noises sometimes.

I was doing some light research as I wrote this post, one thing I thought was interesting is that according to data, younger children read quite a lot more compared to once they start to hit their teens. Which also makes sense since they know it's important to learn how to read at some point, and reading for practice or fun is the best method we have for learning. I guess at a certain point reading stops being cool or something.

Kids are reading less today than ever before, right? [Non-US Teacher]

I’m not a teacher but I’ve heard this opinion from multiple places in the last few years especially. Do we have good data that proves this?

Anecdotally I talk in groups on Discord, and I’ve noticed in groups with demographics that skew younger, maybe older teens on average, people HATE reading longer comments. What’s wild is that I’m usually trying really hard to condense what I say and be as deliberate as possible. I can’t even type three 2-sentence paragraphs without getting a “blud is yapping (skull emoji)” back.

Things definitely didn’t use to be like this. I’ve been around a while now, and these days I’m seeing “TL;DR” replies on comments of mine that take like 10 seconds at most to read. It legitimately makes me wonder if these people are just taking much longer to read through my messages than I would.

Meanwhile I posted a wall of text to a discord that skews older/more intellectual, and people actually responded properly, even though I thought it might be too long. But I feel like as I age those people are going to start disappearing.

I know that the cliche of the older generation worrying about the future is nothing new, but it seems especially dire right now, no? Is it not worth worrying about when we have quantifiable proof that things are only getting worse?

8 months of Iso being garbage, 2 weeks of Iso being way too strong, here’s to the future of Iso being possibly even weaker now than before due to only one charge of double tap for the entire round if you don’t get that first kill within 12s.

Also his shields are still way too loud. Also the rest of his kit is still too niche and unreliable and rivals Harbor wall and deadlock sensor for worst abilities in the game.

Another weird thing nobody mentioned yet is that they kind of removed A, B and C callouts as well. They’re still in the game but you have to go through the text menu to get them now.

Funnily enough you can still select “be quiet” in those menus but it just makes the agent say “let’s play slow” instead. I’m not sure Riot remembers that these menus exist

You're actually right, I completely forgot about Harbor lmao. my B. I honestly forgot his util slows his own team as well, crazy

"really good" is pushing it, but it's at least pretty decent on some maps. It's great for pushing up Lotus A site, or Icebox B site for example.

But in any situation where you need to cover multiple angles, like pushing site from Bind showers, or entry on either site on Breeze (rip Breeze) it starts to fall apart, especially if it's treated as the solo entry.

I 100% agree that the rest of his kit is a problem that needs to be fixed by Riot at some point. Somehow Iso has access to not one, but TWO of the worst abilities in the game currently as his other parts of utility. Before the patch doubletap wasn't SO good to the point where the rest of his kit had to be trash, many people preferred Iso's vuln over doubletap for varying reasons.

I'm pretty sure Riot just doesn't know what they want Iso's identity to be beyond doubletap and his ultimate so they just sort of phoned it in. In the absence of a real rework on the rest of his kit I think the best they can do is try to find a comfortable spot for doubletap for now.

I sort of agree, but for the Skye comparison to check out, all 3 of her abilities would need to roughly add up to the value that 1 ability (Iso's doubletap) gives their team. I'd argue Skye is infinitely more useful than a one-doubletap Iso because she can whiff with 2 of her abilities and still get value from the third in many cases. But this is also oversimplifying since Skye has to spend more credits per round for her kit than Iso will.

I also don't think comparing duelists to non-duelists is a good idea. They obviously aren't meant to bring the same kind of tools to the fight.

At this point I'm convinced his wall is actually a value-negative ability since using it makes you think you'll be safe, but inevitably you're still exposed from a billion other angles and you just die on entry most of the time anyways. In almost all situations a flash would be far more useful for entry, but in Iso's case it would probably be too much.

I think part of it is that when people say that they want food because they feel "hungry", that isn't the whole story. They definitely want to stop feeling hungry, but often I think they also want that little dopamine hit that excessive sugar can give them. You don't need that dopamine to survive, but when the brain learns it's a very good thing over time it can be hard to tell the difference.

If I'm hungry, the idea of filling up on broccoli alone actually sounds like it wouldn't work. But that's not true, it absolutely would make me feel full because it's a ton of food. But emotionally, my brain is telling me it wouldn't work, and that's the issue.

FWIW I'm in relatively good shape as is, but I can certainly eat better and exercise more than I already do. This isn't a huge issue for me personally right now but I enjoy examining thoughts like these.

That's a decent way to look at it. Like if you just took a coke and froze it, it wouldn't be more calories than the food option because it's now more "solid mass" than the food option. It's still the same calories as before, but it appears as a larger mass somehow.

Which makes sense because it's just surrounded by zero calorie water that's diluting the, uh, essence of the drink, so to speak

I think mentally I just have trouble imagining how liquids can be just as dense as solids, when it comes to how your body processes it. Clearly I'm not the only one that feels this way since soda companies have been exploiting this perspective for a while.